This project analyzes the COVID-19 dataset to
uncover insights into global trends, case spikes, mortality rates, and
vaccination progress.
We perform data preprocessing, exploratory data analysis (EDA), and
visualization to understand key patterns.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from scipy.stats import zscore
import math
from scipy.stats.mstats import winsorize
from google.colab import drive
drive.mount('/content/drive')
df = pd.read_csv('/content/drive/My Drive/Data Sets/covid-data.csv')
Mounted at /content/drive
Note: By default the notebook truncates wide or long DataFrame output. The following commands raise the column and row display limits so that the complete output is shown:
pd.set_option('display.max_columns', None)  # display all the columns in the DataFrame
pd.set_option('display.max_rows', None)  # display all the rows in the DataFrame

# Hide all runtime warnings
import warnings
warnings.filterwarnings('ignore')

# Display the first few rows
df.head()

|  | iso_code | continent | location | date | total_cases | new_cases | new_cases_smoothed | total_deaths | new_deaths | new_deaths_smoothed | total_cases_per_million | new_cases_per_million | new_cases_smoothed_per_million | total_deaths_per_million | new_deaths_per_million | new_deaths_smoothed_per_million | reproduction_rate | icu_patients | icu_patients_per_million | hosp_patients | hosp_patients_per_million | weekly_icu_admissions | weekly_icu_admissions_per_million | weekly_hosp_admissions | weekly_hosp_admissions_per_million | total_tests | new_tests | total_tests_per_thousand | new_tests_per_thousand | new_tests_smoothed | new_tests_smoothed_per_thousand | positive_rate | tests_per_case | tests_units | total_vaccinations | people_vaccinated | people_fully_vaccinated | total_boosters | new_vaccinations | new_vaccinations_smoothed | total_vaccinations_per_hundred | people_vaccinated_per_hundred | people_fully_vaccinated_per_hundred | total_boosters_per_hundred | new_vaccinations_smoothed_per_million | new_people_vaccinated_smoothed | new_people_vaccinated_smoothed_per_hundred | stringency_index | population_density | median_age | aged_65_older | aged_70_older | gdp_per_capita | extreme_poverty | cardiovasc_death_rate | diabetes_prevalence | female_smokers | male_smokers | handwashing_facilities | hospital_beds_per_thousand | life_expectancy | human_development_index | population | excess_mortality_cumulative_absolute | excess_mortality_cumulative | excess_mortality | excess_mortality_cumulative_per_million |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Asia | Afghanistan | 2020-01-03 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 54.422 | 18.6 | 2.581 | 1.337 | 1803.987 | NaN | 597.029 | 9.59 | NaN | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| 1 | AFG | Asia | Afghanistan | 2020-01-04 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 54.422 | 18.6 | 2.581 | 1.337 | 1803.987 | NaN | 597.029 | 9.59 | NaN | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| 2 | AFG | Asia | Afghanistan | 2020-01-05 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 54.422 | 18.6 | 2.581 | 1.337 | 1803.987 | NaN | 597.029 | 9.59 | NaN | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| 3 | AFG | Asia | Afghanistan | 2020-01-06 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 54.422 | 18.6 | 2.581 | 1.337 | 1803.987 | NaN | 597.029 | 9.59 | NaN | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
| 4 | AFG | Asia | Afghanistan | 2020-01-07 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 54.422 | 18.6 | 2.581 | 1.337 | 1803.987 | NaN | 597.029 | 9.59 | NaN | NaN | 37.746 | 0.5 | 64.83 | 0.511 | 41128772.0 | NaN | NaN | NaN | NaN |
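
The `pd.set_option` calls above change the display limits for the whole session. A scoped alternative is `pd.option_context`, which restores the previous limits once the block ends. The sketch below illustrates this on a small made-up frame (`demo` and its `col_i` columns are placeholders, not part of the COVID-19 data):

```python
import pandas as pd

# Toy frame standing in for the real data (hypothetical columns and values).
demo = pd.DataFrame({f"col_{i}": range(3) for i in range(30)})

# Remember the current limit so we can confirm it is restored afterwards.
before = pd.get_option("display.max_columns")

# Lift the column limit only inside this block.
with pd.option_context("display.max_columns", None, "display.max_rows", 20):
    inside = pd.get_option("display.max_columns")  # None means "no limit"
    print(demo.head())

# Outside the block the earlier setting is back in effect.
after = pd.get_option("display.max_columns")
```

This keeps wide previews such as `df.head()` fully visible while avoiding an unlimited row display for the rest of the notebook.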
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 302512 entries, 0 to 302511
Data columns (total 67 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 iso_code 302512 non-null object
1 continent 288160 non-null object
2 location 302512 non-null object
3 date 302512 non-null object
4 total_cases 266771 non-null float64
5 new_cases 294064 non-null float64
6 new_cases_smoothed 292800 non-null float64
7 total_deaths 246214 non-null float64
8 new_deaths 294139 non-null float64
9 new_deaths_smoothed 292909 non-null float64
10 total_cases_per_million 266771 non-null float64
11 new_cases_per_million 294064 non-null float64
12 new_cases_smoothed_per_million 292800 non-null float64
13 total_deaths_per_million 246214 non-null float64
14 new_deaths_per_million 294139 non-null float64
15 new_deaths_smoothed_per_million 292909 non-null float64
16 reproduction_rate 184817 non-null float64
17 icu_patients 34764 non-null float64
18 icu_patients_per_million 34764 non-null float64
19 hosp_patients 35138 non-null float64
20 hosp_patients_per_million 35138 non-null float64
21 weekly_icu_admissions 9101 non-null float64
22 weekly_icu_admissions_per_million 9101 non-null float64
23 weekly_hosp_admissions 21287 non-null float64
24 weekly_hosp_admissions_per_million 21287 non-null float64
25 total_tests 79387 non-null float64
26 new_tests 75403 non-null float64
27 total_tests_per_thousand 79387 non-null float64
28 new_tests_per_thousand 75403 non-null float64
29 new_tests_smoothed 103965 non-null float64
30 new_tests_smoothed_per_thousand 103965 non-null float64
31 positive_rate 95927 non-null float64
32 tests_per_case 94348 non-null float64
33 tests_units 106788 non-null object
34 total_vaccinations 73561 non-null float64
35 people_vaccinated 70411 non-null float64
36 people_fully_vaccinated 68149 non-null float64
37 total_boosters 42324 non-null float64
38 new_vaccinations 60542 non-null float64
39 new_vaccinations_smoothed 163536 non-null float64
40 total_vaccinations_per_hundred 73561 non-null float64
41 people_vaccinated_per_hundred 70411 non-null float64
42 people_fully_vaccinated_per_hundred 68149 non-null float64
43 total_boosters_per_hundred 42324 non-null float64
44 new_vaccinations_smoothed_per_million 163536 non-null float64
45 new_people_vaccinated_smoothed 163587 non-null float64
46 new_people_vaccinated_smoothed_per_hundred 163587 non-null float64
47 stringency_index 193194 non-null float64
48 population_density 256703 non-null float64
49 median_age 238751 non-null float64
50 aged_65_older 230391 non-null float64
51 aged_70_older 236359 non-null float64
52 gdp_per_capita 233979 non-null float64
53 extreme_poverty 150700 non-null float64
54 cardiovasc_death_rate 234406 non-null float64
55 diabetes_prevalence 246348 non-null float64
56 female_smokers 175815 non-null float64
57 male_smokers 173423 non-null float64
58 handwashing_facilities 114817 non-null float64
59 hospital_beds_per_thousand 206911 non-null float64
60 life_expectancy 278219 non-null float64
61 human_development_index 227212 non-null float64
62 population 302512 non-null float64
63 excess_mortality_cumulative_absolute 10295 non-null float64
64 excess_mortality_cumulative 10295 non-null float64
65 excess_mortality 10295 non-null float64
66 excess_mortality_cumulative_per_million 10295 non-null float64
dtypes: float64(62), object(5)
memory usage: 154.6+ MB
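
The `df.info()` output shows two things that matter for preprocessing: `date` is stored as `object` rather than a datetime type, and many columns (for example the ICU, testing, and excess-mortality columns) are mostly empty. The sketch below shows one possible way to quantify missingness and parse the dates. It runs on a toy frame, and `missing_summary` is a hypothetical helper name, not something defined elsewhere in this notebook:

```python
import numpy as np
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return per-column missing counts and percentages, most-missing first."""
    missing = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": missing,
        "missing_pct": (missing / len(frame) * 100).round(2),
    })
    return summary.sort_values("missing_pct", ascending=False)


# Toy data using a few of the real column names (values are illustrative only).
toy = pd.DataFrame({
    "date": ["2020-01-03", "2020-01-04", "2020-01-05", "2020-01-06"],
    "total_cases": [np.nan, np.nan, 1.0, 2.0],
    "icu_patients": [np.nan, np.nan, np.nan, 5.0],
})

# Convert the string dates to datetime64 so time-based operations work.
toy["date"] = pd.to_datetime(toy["date"], format="%Y-%m-%d")

report = missing_summary(toy)
print(report)
```

On the real data the same calls would be `df["date"] = pd.to_datetime(df["date"])` and `missing_summary(df)`, assuming `df` is the frame loaded above.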
df.sample(50)

|  | iso_code | continent | location | date | total_cases | new_cases | new_cases_smoothed | total_deaths | new_deaths | new_deaths_smoothed | total_cases_per_million | new_cases_per_million | new_cases_smoothed_per_million | total_deaths_per_million | new_deaths_per_million | new_deaths_smoothed_per_million | reproduction_rate | icu_patients | icu_patients_per_million | hosp_patients | hosp_patients_per_million | weekly_icu_admissions | weekly_icu_admissions_per_million | weekly_hosp_admissions | weekly_hosp_admissions_per_million | total_tests | new_tests | total_tests_per_thousand | new_tests_per_thousand | new_tests_smoothed | new_tests_smoothed_per_thousand | positive_rate | tests_per_case | tests_units | total_vaccinations | people_vaccinated | people_fully_vaccinated | total_boosters | new_vaccinations | new_vaccinations_smoothed | total_vaccinations_per_hundred | people_vaccinated_per_hundred | people_fully_vaccinated_per_hundred | total_boosters_per_hundred | new_vaccinations_smoothed_per_million | new_people_vaccinated_smoothed | new_people_vaccinated_smoothed_per_hundred | stringency_index | population_density | median_age | aged_65_older | aged_70_older | gdp_per_capita | extreme_poverty | cardiovasc_death_rate | diabetes_prevalence | female_smokers | male_smokers | handwashing_facilities | hospital_beds_per_thousand | life_expectancy | human_development_index | population | excess_mortality_cumulative_absolute | excess_mortality_cumulative | excess_mortality | excess_mortality_cumulative_per_million |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 82258 | SWZ | Africa | Eswatini | 2022-10-11 | 73436.0 | 0.0 | 3.714 | 1422.0 | 0.0 | 0.000 | 61111.111 | 0.000 | 3.091 | 1183.343 | 0.000 | 0.000 | 0.11 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | NaN | NaN | NaN | NaN | 0.0 | 198.0 | 0.016 | 23.15 | 79.492 | 21.5 | 3.163 | 1.845 | 7738.975 | NaN | 333.436 | 3.94 | 1.700 | 16.500 | 24.097 | 2.100 | 60.19 | 0.611 | 1.201680e+06 | NaN | NaN | NaN | NaN |
| 31128 | BOL | South America | Bolivia | 2020-02-02 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.00 | 10.202 | 25.4 | 6.704 | 4.393 | 6885.829 | 7.1 | 204.299 | 6.89 | NaN | NaN | 25.383 | 1.100 | 71.51 | 0.718 | 1.222411e+07 | NaN | NaN | NaN | NaN |
| 95446 | GAB | Africa | Gabon | 2022-11-12 | 48959.0 | 0.0 | 0.000 | 306.0 | 0.0 | 0.000 | 20493.538 | 0.000 | 0.000 | 128.087 | 0.000 | 0.000 | 0.10 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 102.0 | NaN | NaN | NaN | NaN | 43.0 | 7.0 | 0.000 | 11.11 | 7.859 | 23.1 | 4.450 | 2.976 | 16562.413 | 3.4 | 259.967 | 7.20 | NaN | NaN | NaN | 6.300 | 66.47 | 0.703 | 2.388997e+06 | NaN | NaN | NaN | NaN |
| 211644 | PER | South America | Peru | 2022-11-10 | 4163326.0 | 1557.0 | 822.429 | 217103.0 | 5.0 | 12.857 | 122272.434 | 45.727 | 24.154 | 6376.083 | 0.147 | 0.378 | 1.45 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8.547293e+07 | 3.004870e+07 | 2.831865e+07 | 27105585.0 | 28141.0 | 21892.0 | 251.02 | 88.25 | 83.17 | 79.61 | 643.0 | 4200.0 | 0.012 | 11.11 | 25.129 | 29.1 | 7.151 | 4.455 | 12236.706 | 3.5 | 85.755 | 5.95 | 4.800 | NaN | NaN | 1.600 | 76.74 | 0.777 | 3.404959e+07 | NaN | NaN | NaN | NaN |
| 194617 | PRK | Asia | North Korea | 2020-07-10 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 29.0 | 0.001 | NaN | NaN | samples tested | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 211.701 | 35.3 | 9.491 | 6.139 | NaN | NaN | 321.681 | 4.00 | NaN | NaN | NaN | 13.200 | 72.27 | NaN | 2.606942e+07 | NaN | NaN | NaN | NaN |
| 78145 | GNQ | Africa | Equatorial Guinea | 2021-05-04 | 7694.0 | 0.0 | 19.286 | 112.0 | 0.0 | 0.714 | 4593.663 | 0.000 | 11.514 | 66.869 | 0.000 | 0.426 | 0.47 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4766.0 | NaN | NaN | NaN | NaN | 2846.0 | 3239.0 | 0.193 | NaN | 45.194 | 22.4 | 2.846 | 1.752 | 22604.873 | NaN | 202.812 | 7.78 | NaN | NaN | 24.640 | 2.100 | 58.74 | 0.592 | 1.674916e+06 | NaN | NaN | NaN | NaN |
| 250127 | ZAF | Africa | South Africa | 2020-05-12 | 10652.0 | 637.0 | 490.286 | 206.0 | 12.0 | 9.714 | 177.848 | 10.635 | 8.186 | 3.439 | 0.200 | 0.162 | 1.44 | 70.0 | 1.169 | 434.0 | 7.246 | NaN | NaN | NaN | NaN | 369697.0 | 13630.0 | 6.225 | 0.229 | 14519.0 | 0.244 | 0.0372 | 26.9 | people tested | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 84.26 | 46.754 | 27.3 | 5.344 | 3.053 | 12294.876 | 18.9 | 200.380 | 5.52 | 8.100 | 33.200 | 43.993 | 2.320 | 64.13 | 0.709 | 5.989388e+07 | NaN | NaN | NaN | NaN |
| 197311 | OWID_CYN | Asia | Northern Cyprus | 2022-05-20 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 655.0 | NaN | NaN | NaN | NaN | 1711.0 | 46.0 | 0.012 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.828360e+05 | NaN | NaN | NaN | NaN |
| 278666 | TCA | North America | Turks and Caicos Islands | 2023-03-23 | 6565.0 | 0.0 | 0.429 | 38.0 | 0.0 | 0.000 | 143572.585 | 0.000 | 9.373 | 831.037 | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 37.312 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 80.22 | NaN | 4.572600e+04 | NaN | NaN | NaN | NaN |
| 235960 | SAU | Asia | Saudi Arabia | 2020-09-01 | 315772.0 | 951.0 | 1016.857 | 3897.0 | 27.0 | 29.429 | 8672.952 | 26.120 | 27.929 | 107.034 | 0.742 | 0.808 | 0.83 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 5393167.0 | 55801.0 | 150.017 | 1.552 | 54944.0 | 1.528 | 0.0179 | 55.7 | tests performed | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 60.19 | 15.322 | 31.9 | 3.295 | 1.845 | 49045.411 | NaN | 259.538 | 17.72 | 1.800 | 25.400 | NaN | 2.700 | 75.13 | 0.854 | 3.640882e+07 | NaN | NaN | NaN | NaN |
| 142214 | LAO | Asia | Laos | 2020-01-06 | NaN | 0.0 | NaN | NaN | 0.0 | NaN | NaN | 0.000 | NaN | NaN | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 29.715 | 24.4 | 4.029 | 2.322 | 6397.360 | 22.7 | 368.111 | 4.00 | 7.300 | 51.200 | 49.839 | 1.500 | 67.92 | 0.613 | 7.529477e+06 | NaN | NaN | NaN | NaN |
| 260989 | CHE | Europe | Switzerland | 2020-08-19 | 38873.0 | 294.0 | 231.000 | 1762.0 | 0.0 | 0.714 | 4447.472 | 33.637 | 26.429 | 201.591 | 0.000 | 0.082 | 1.19 | 33.0 | 3.776 | 123.0 | 14.072 | NaN | NaN | 57.0 | 6.521 | 540718.0 | 9388.0 | 62.213 | 1.080 | 7136.0 | 0.821 | 0.0360 | 27.8 | tests performed | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 43.06 | 214.243 | 43.1 | 18.436 | 12.644 | 57410.166 | NaN | 99.739 | 5.59 | 22.600 | 28.900 | NaN | 4.530 | 83.78 | 0.955 | 8.740471e+06 | NaN | NaN | NaN | NaN |
| 42232 | BDI | Africa | Burundi | 2021-01-07 | 885.0 | 22.0 | 9.571 | 2.0 | 0.0 | 0.000 | 68.660 | 1.707 | 0.743 | 0.155 | 0.000 | 0.000 | 1.18 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 11.11 | 423.062 | 17.5 | 2.562 | 1.504 | 702.225 | 71.7 | 293.068 | 6.05 | NaN | NaN | 6.144 | 0.800 | 61.58 | 0.433 | 1.288958e+07 | NaN | NaN | NaN | NaN |
| 47644 | CPV | Africa | Cape Verde | 2022-09-27 | 62368.0 | 8.0 | 1.714 | 410.0 | 0.0 | 0.000 | 105144.969 | 13.487 | 2.890 | 691.211 | 0.000 | 0.000 | 0.82 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 19.44 | 135.580 | 25.7 | 4.460 | 3.437 | 6222.554 | NaN | 182.219 | 2.42 | 2.100 | 16.500 | NaN | 2.100 | 72.98 | 0.665 | 5.931620e+05 | NaN | NaN | NaN | NaN |
| 290731 | VUT | Oceania | Vanuatu | 2020-03-27 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 83.33 | 22.662 | 23.1 | 4.394 | 2.620 | 2921.909 | 13.2 | 546.300 | 12.02 | 2.800 | 34.500 | 25.209 | NaN | 70.47 | 0.609 | 3.267440e+05 | NaN | NaN | NaN | NaN |
| 158060 | MWI | Africa | Malawi | 2021-12-13 | 62265.0 | 35.0 | 40.571 | 2308.0 | 1.0 | 0.143 | 3051.410 | 1.715 | 1.988 | 113.108 | 0.049 | 0.007 | 2.50 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1138.0 | 0.057 | 0.0481 | 20.8 | tests performed | 1.560093e+06 | 1.276252e+06 | 6.349090e+05 | NaN | 6746.0 | 11361.0 | 7.65 | 6.25 | 3.11 | NaN | 557.0 | 11121.0 | 0.055 | 39.81 | 197.519 | 18.1 | 2.979 | 1.783 | 1095.042 | 71.4 | 227.349 | 3.94 | 4.400 | 24.700 | 8.704 | 1.300 | 64.26 | 0.483 | 2.040532e+07 | NaN | NaN | NaN | NaN |
| 106844 | GUM | Oceania | Guam | 2021-04-29 | 7733.0 | 12.0 | 7.714 | 136.0 | 0.0 | 0.000 | 45016.096 | 69.856 | 44.907 | 791.697 | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 111205.0 | 147.0 | 652.099 | 0.862 | 1046.0 | 6.134 | 0.0060 | 166.7 | tests performed | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 304.128 | 31.4 | 9.551 | 5.493 | NaN | NaN | 310.496 | 21.52 | NaN | NaN | NaN | NaN | 80.07 | NaN | 1.717830e+05 | NaN | NaN | NaN | NaN |
| 83681 | OWID_EUR | NaN | Europe | 2020-02-15 | 86.0 | 1.0 | 2.857 | 2.0 | 2.0 | 0.286 | 0.115 | 0.001 | 0.004 | 0.003 | 0.003 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7.448078e+08 | NaN | NaN | NaN | NaN |
| 25798 | BEL | Europe | Belgium | 2021-11-13 | 1479519.0 | 5053.0 | 9680.143 | 26470.0 | 32.0 | 26.143 | 126932.805 | 433.514 | 830.491 | 2270.948 | 2.745 | 2.243 | 1.40 | 489.0 | 41.953 | 2407.0 | 206.504 | NaN | NaN | 1567.0 | 134.438 | 22979081.0 | 88197.0 | 1979.007 | 7.596 | 80767.0 | 6.956 | 0.1370 | 7.3 | tests performed | 1.794216e+07 | 8.820783e+06 | 8.676959e+06 | 864824.0 | 21596.0 | 23213.0 | 153.93 | 75.68 | 74.44 | 7.42 | 1992.0 | 3579.0 | 0.031 | 31.92 | 375.564 | 41.8 | 18.571 | 12.849 | 42658.576 | 0.2 | 114.898 | 4.29 | 25.100 | 31.400 | NaN | 5.640 | 81.63 | 0.931 | 1.165592e+07 | NaN | NaN | NaN | NaN |
| 258743 | SUR | South America | Suriname | 2021-01-12 | 7008.0 | 60.0 | 87.857 | 133.0 | 1.0 | 1.429 | 11338.962 | 97.080 | 142.153 | 215.194 | 1.618 | 2.311 | 1.10 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 39.0 | 0.064 | NaN | NaN | tests performed | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 67.59 | 3.612 | 29.6 | 6.933 | 4.229 | 13767.119 | NaN | 258.314 | 12.54 | 7.400 | 42.900 | 67.779 | 3.100 | 71.68 | 0.738 | 6.180460e+05 | NaN | NaN | NaN | NaN |
| 196431 | MKD | Europe | North Macedonia | 2022-03-20 | 303354.0 | 214.0 | 266.143 | 9184.0 | 8.0 | 4.571 | 144895.458 | 102.216 | 127.122 | 4386.690 | 3.821 | 2.184 | 0.86 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2134.0 | 1.015 | 0.1247 | 8.0 | tests performed | 1.836421e+06 | NaN | 8.351510e+05 | 148686.0 | NaN | 485.0 | 87.72 | NaN | 39.89 | 7.10 | 232.0 | 14.0 | 0.001 | NaN | 82.600 | 39.1 | 13.260 | 8.160 | 13111.214 | 5.0 | 322.688 | 10.08 | NaN | NaN | NaN | 4.280 | 75.80 | 0.774 | 2.093606e+06 | NaN | NaN | NaN | NaN |
| 37900 | VGB | North America | British Virgin Islands | 2022-04-04 | 6141.0 | 0.0 | 5.286 | 62.0 | 0.0 | 0.000 | 195997.702 | 0.000 | 168.700 | 1978.808 | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 11.0 | NaN | NaN | NaN | NaN | 351.0 | 4.0 | 0.013 | NaN | 207.973 | NaN | NaN | NaN | NaN | NaN | NaN | 13.67 | NaN | NaN | NaN | NaN | 79.07 | NaN | 3.133200e+04 | NaN | NaN | NaN | NaN |
| 232443 | WSM | Oceania | Samoa | 2020-11-11 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 69.413 | 22.0 | 5.606 | 3.564 | 6021.557 | NaN | 348.977 | 9.21 | 16.700 | 38.100 | NaN | NaN | 73.32 | 0.715 | 2.223900e+05 | NaN | NaN | NaN | NaN |
| 279616 | TUV | Oceania | Tuvalu | 2022-07-20 | 8.0 | 0.0 | 0.000 | NaN | 0.0 | 0.000 | 705.779 | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 220.0 | NaN | NaN | NaN | NaN | 19409.0 | 5.0 | 0.044 | NaN | 373.067 | NaN | NaN | NaN | 3575.104 | 3.3 | NaN | 27.25 | NaN | NaN | NaN | NaN | 67.57 | NaN | 1.133500e+04 | NaN | NaN | NaN | NaN |
| 12009 | ARM | Asia | Armenia | 2020-02-19 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 73.0 | 5.0 | 0.026 | 0.002 | 1.0 | 0.000 | 0.0000 | NaN | tests performed | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 102.931 | 35.7 | 11.232 | 7.571 | 8787.580 | 1.8 | 341.010 | 7.11 | 1.500 | 52.100 | 94.043 | 4.200 | 75.09 | 0.776 | 2.780472e+06 | NaN | NaN | NaN | NaN |
| 35803 | BWA | Africa | Botswana | 2023-01-24 | 329214.0 | 0.0 | 45.143 | 2792.0 | 0.0 | 0.286 | 125162.149 | 0.000 | 17.163 | 1061.476 | 0.000 | 0.109 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1444.0 | NaN | NaN | NaN | NaN | 549.0 | 307.0 | 0.012 | NaN | 4.044 | 25.8 | 3.941 | 2.242 | 15807.374 | NaN | 237.372 | 4.81 | 5.700 | 34.400 | NaN | 1.800 | 69.59 | 0.735 | 2.630300e+06 | NaN | NaN | NaN | NaN |
| 64200 | CUW | North America | Curacao | 2022-03-23 | 39853.0 | 293.0 | 41.857 | 265.0 | 0.0 | 0.000 | 208465.631 | 1532.643 | 218.949 | 1386.179 | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 13.0 | 0.068 | NaN | NaN | tests performed | 2.462320e+05 | 1.074000e+05 | 9.844100e+04 | 40391.0 | 79.0 | 60.0 | 128.80 | 56.18 | 51.49 | 21.13 | 314.0 | 12.0 | 0.006 | NaN | 362.644 | 41.7 | 16.367 | 10.068 | NaN | NaN | NaN | 11.62 | NaN | NaN | NaN | NaN | 78.88 | NaN | 1.911730e+05 | NaN | NaN | NaN | NaN |
| 79444 | ERI | Africa | Eritrea | 2021-08-15 | 6601.0 | 1.0 | 3.571 | 37.0 | 1.0 | 0.286 | 1791.782 | 0.271 | 0.969 | 10.043 | 0.271 | 0.078 | 0.80 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 44.304 | 19.3 | 3.607 | 2.171 | 1510.459 | NaN | 311.110 | 6.05 | 0.200 | 11.400 | NaN | 0.700 | 66.32 | 0.459 | 3.684041e+06 | NaN | NaN | NaN | NaN |
| 20160 | BHS | North America | Bahamas | 2022-10-21 | 37342.0 | 0.0 | 1.714 | 833.0 | 0.0 | 0.000 | 91080.492 | 0.000 | 4.181 | 2031.762 | 0.000 | 0.000 | 0.50 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.629680e+05 | 1.735290e+05 | 1.653170e+05 | 35557.0 | NaN | 41.0 | 88.53 | 42.33 | 40.32 | 8.67 | 100.0 | 15.0 | 0.004 | 20.37 | 39.497 | 34.3 | 8.996 | 5.200 | 27717.847 | NaN | 235.954 | 13.17 | 3.100 | 20.400 | NaN | 2.900 | 73.92 | 0.814 | 4.099890e+05 | NaN | NaN | NaN | NaN |
| 227361 | LCA | North America | Saint Lucia | 2020-01-18 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 293.187 | 34.9 | 9.721 | 6.405 | 12951.839 | NaN | 204.620 | 11.62 | NaN | NaN | 87.202 | 1.300 | 76.20 | 0.759 | 1.798720e+05 | NaN | NaN | NaN | NaN |
| 112117 | GNB | Africa | Guinea-Bissau | 2022-08-31 | 8796.0 | 0.0 | 43.571 | 175.0 | 0.0 | 0.000 | 4177.471 | 0.000 | 20.693 | 83.112 | 0.000 | 0.000 | 0.01 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8.0 | NaN | NaN | NaN | NaN | 4.0 | 8.0 | 0.000 | NaN | 66.191 | 19.4 | 3.002 | 1.565 | 1548.675 | 67.1 | 382.474 | 2.42 | NaN | NaN | 6.403 | NaN | 58.32 | 0.480 | 2.105580e+06 | NaN | NaN | NaN | NaN |
| 40681 | BFA | Africa | Burkina Faso | 2020-01-18 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 70.151 | 17.6 | 2.409 | 1.358 | 1703.102 | 43.7 | 269.048 | 2.42 | 1.600 | 23.900 | 11.877 | 0.400 | 61.58 | 0.452 | 2.267376e+07 | NaN | NaN | NaN | NaN |
| 137703 | KIR | Oceania | Kiribati | 2020-10-05 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 22.22 | 143.701 | 23.2 | 3.895 | 2.210 | 1981.132 | NaN | 434.657 | 22.66 | 35.900 | 58.900 | NaN | 1.900 | 68.37 | 0.630 | 1.312370e+05 | NaN | NaN | NaN | NaN |
| 97401 | GEO | Asia | Georgia | 2021-09-01 | 553697.0 | 3886.0 | 3664.857 | 7482.0 | 74.0 | 76.143 | 147873.950 | 1037.821 | 978.761 | 1998.192 | 19.763 | 20.335 | 0.86 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 6491731.0 | 52162.0 | 1727.452 | 13.880 | 39147.0 | 10.417 | 0.0936 | 10.7 | tests performed | 1.235386e+06 | 8.136010e+05 | 4.217850e+05 | NaN | 24640.0 | 24220.0 | 32.99 | 21.73 | 11.26 | NaN | 6468.0 | 11356.0 | 0.303 | 50.93 | 65.032 | 38.7 | 14.864 | 10.244 | 9745.079 | 4.2 | 496.218 | 7.11 | 5.300 | 55.500 | NaN | 2.600 | 73.77 | 0.812 | 3.744385e+06 | NaN | NaN | NaN | NaN |
| 12727 | ARM | Asia | Armenia | 2022-02-06 | 389957.0 | 2467.0 | 3360.571 | 8086.0 | 5.0 | 5.714 | 140248.490 | 887.259 | 1208.633 | 2908.139 | 1.798 | 2.055 | 1.27 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2784379.0 | 4803.0 | 997.637 | 1.721 | 8043.0 | 2.882 | 0.4230 | 2.4 | tests performed | 1.925556e+06 | 1.054178e+06 | 8.589810e+05 | 12397.0 | NaN | 6020.0 | 69.25 | 37.91 | 30.89 | 0.45 | 2165.0 | 3133.0 | 0.113 | NaN | 102.931 | 35.7 | 11.232 | 7.571 | 8787.580 | 1.8 | 341.010 | 7.11 | 1.500 | 52.100 | 94.043 | 4.200 | 75.09 | 0.776 | 2.780472e+06 | NaN | NaN | NaN | NaN |
| 11129 | ARG | South America | Argentina | 2020-12-31 | 1674319.0 | 6969.0 | 8183.000 | 48271.0 | 113.0 | 150.000 | 36789.872 | 153.130 | 179.805 | 1060.660 | 2.483 | 3.296 | 1.21 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 5126379.0 | 34858.0 | 113.223 | 0.770 | 34273.0 | 0.757 | 0.2170 | 4.6 | tests performed | 4.340000e+04 | 4.339200e+04 | 7.000000e+00 | 1.0 | 2806.0 | 11454.0 | 0.10 | 0.10 | 0.00 | 0.00 | 252.0 | 11451.0 | 0.025 | 79.17 | 16.177 | 31.9 | 11.198 | 7.441 | 18933.907 | 0.6 | 191.032 | 5.50 | 16.200 | 27.700 | NaN | 5.000 | 76.67 | 0.845 | 4.551032e+07 | 36108.2000 | 10.57 | 19.65 | 801.76245 |
| 114051 | HTI | North America | Haiti | 2021-05-30 | 15045.0 | 114.0 | 131.000 | 321.0 | 0.0 | 2.714 | 1298.662 | 9.840 | 11.308 | 27.708 | 0.000 | 0.234 | 1.27 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 50.93 | 398.448 | 24.3 | 4.800 | 2.954 | 1653.173 | 23.5 | 430.548 | 6.65 | 2.900 | 23.100 | 22.863 | 0.700 | 64.00 | 0.510 | 1.158500e+07 | NaN | NaN | NaN | NaN |
| 52250 | CHL | South America | Chile | 2022-04-02 | 3476914.0 | 5978.0 | 6000.857 | 56637.0 | 57.0 | 59.286 | 177359.764 | 304.942 | 306.108 | 2889.092 | 2.908 | 3.024 | 0.62 | 546.0 | 27.852 | NaN | NaN | 128.0 | 6.529 | 668.0 | 34.075 | 35407400.0 | 71064.0 | 1816.399 | 3.646 | 67343.0 | 3.455 | 0.0831 | 12.0 | tests performed | 5.087079e+07 | 1.788439e+07 | 1.740085e+07 | 16160143.0 | 8895.0 | 42747.0 | 259.50 | 91.23 | 88.76 | 82.43 | 2181.0 | 1719.0 | 0.009 | 28.78 | 24.282 | 35.4 | 11.087 | 6.938 | 22767.037 | 1.3 | 127.993 | 8.46 | 34.200 | 41.500 | NaN | 2.110 | 80.18 | 0.851 | 1.960374e+07 | NaN | NaN | NaN | NaN |
| 217718 | PRI | North America | Puerto Rico | 2023-02-12 | 1090881.0 | 799.0 | 609.857 | 5750.0 | 10.0 | 5.429 | 335406.769 | 245.664 | 187.509 | 1767.919 | 3.075 | 1.669 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 376.232 | 38.2 | 15.168 | 9.829 | 35044.670 | NaN | 108.094 | 12.90 | NaN | NaN | NaN | NaN | 80.10 | NaN | 3.252412e+06 | 10050.1980 | 10.59 | 26.10 | 3082.58500 |
| 7194 | AGO | Africa | Angola | 2020-01-21 | NaN | 0.0 | 0.000 | NaN | 0.0 | 0.000 | NaN | 0.000 | 0.000 | NaN | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 23.890 | 16.8 | 2.405 | 1.362 | 5819.495 | NaN | 276.045 | 3.94 | NaN | NaN | 26.664 | NaN | 61.15 | 0.581 | 3.558900e+07 | NaN | NaN | NaN | NaN |
| 215875 | PRT | Europe | Portugal | 2021-05-06 | 824281.0 | 553.0 | 527.286 | 16974.0 | 2.0 | 1.429 | 80254.355 | 53.842 | 51.338 | 1652.637 | 0.195 | 0.139 | 0.95 | 77.0 | 7.497 | 283.0 | 27.554 | NaN | NaN | NaN | NaN | 10839340.0 | 48583.0 | 1053.375 | 4.721 | 45025.0 | 4.376 | 0.0077 | 129.1 | tests performed | NaN | NaN | NaN | NaN | NaN | 82363.0 | NaN | NaN | NaN | NaN | 8019.0 | 54695.0 | 0.533 | 72.22 | 112.371 | 46.2 | 21.502 | 14.924 | 27936.896 | 0.5 | 127.842 | 9.85 | 16.300 | 30.000 | NaN | 3.390 | 82.05 | 0.864 | 1.027086e+07 | NaN | NaN | NaN | NaN |
| 161470 | MLI | Africa | Mali | 2021-06-18 | 14364.0 | 5.0 | 5.000 | 523.0 | 0.0 | 0.286 | 635.755 | 0.221 | 0.221 | 23.148 | 0.000 | 0.013 | 0.95 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1873.0 | NaN | NaN | NaN | NaN | 83.0 | 1550.0 | 0.007 | 44.44 | 15.196 | 16.4 | 2.519 | 1.486 | 2014.306 | NaN | 268.024 | 2.42 | 1.600 | 23.000 | 52.232 | 0.100 | 59.31 | 0.434 | 2.259360e+07 | NaN | NaN | NaN | NaN |
| 183801 | NPL | Asia | Nepal | 2020-05-19 | 402.0 | 27.0 | 26.429 | 2.0 | 0.0 | 0.286 | 13.160 | 0.884 | 0.865 | 0.065 | 0.000 | 0.009 | 1.66 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 33006.0 | 2282.0 | 1.099 | 0.076 | 2006.0 | 0.067 | 0.0130 | 76.9 | samples tested | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 92.59 | 204.430 | 25.0 | 5.809 | 3.212 | 2442.804 | 15.0 | 260.797 | 7.26 | 9.500 | 37.800 | 47.782 | 0.300 | 70.78 | 0.602 | 3.054759e+07 | NaN | NaN | NaN | NaN |
| 99607 | GHA | Africa | Ghana | 2021-02-27 | 84023.0 | 409.0 | 466.286 | 607.0 | 11.0 | 3.571 | 2509.957 | 12.218 | 13.929 | 18.132 | 0.329 | 0.107 | 1.02 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 906827.0 | 4014.0 | 27.619 | 0.122 | 4284.0 | 0.130 | 0.0977 | 10.2 | tests performed | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 44.44 | 126.719 | 21.1 | 3.385 | 1.948 | 4227.630 | 12.0 | 298.245 | 4.97 | 0.300 | 7.700 | 41.047 | 0.900 | 64.07 | 0.611 | 3.347587e+07 | NaN | NaN | NaN | NaN |
| 290560 | UZB | Asia | Uzbekistan | 2023-01-16 | 250261.0 | 17.0 | 31.714 | 1637.0 | 0.0 | 0.000 | 7227.202 | 0.491 | 0.916 | 47.274 | 0.000 | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 22772.0 | NaN | NaN | NaN | NaN | 658.0 | 1105.0 | 0.003 | NaN | 76.134 | 28.2 | 4.469 | 2.873 | 6253.104 | NaN | 724.417 | 7.57 | 1.300 | 24.700 | NaN | 4.000 | 71.72 | 0.720 | 3.462765e+07 | NaN | NaN | NaN | NaN |
| 247062 | SVN | Europe | Slovenia | 2021-10-17 | 308254.0 | 632.0 | 954.143 | 5031.0 | 4.0 | 4.857 | 145413.599 | 298.135 | 450.101 | 2373.289 | 1.887 | 2.291 | 1.27 | 121.0 | 57.080 | 411.0 | 193.882 | 56.0 | 26.576 | 260.0 | 122.439 | 1651831.0 | 1357.0 | 779.382 | 0.640 | 4305.0 | 2.031 | 0.2270 | 4.4 | tests performed | 2.167418e+06 | 1.178841e+06 | 1.097277e+06 | 31999.0 | 39.0 | 4572.0 | 102.24 | 55.61 | 51.76 | 1.51 | 2157.0 | 673.0 | 0.032 | 42.78 | 102.619 | 44.5 | 19.062 | 12.930 | 31400.840 | NaN | 153.493 | 7.25 | 20.100 | 25.000 | NaN | 4.500 | 81.32 | 0.917 | 2.119843e+06 | 3769.6003 | 9.94 | 13.22 | 1778.61000 |
| 136719 | KEN | Africa | Kenya | 2021-05-05 | 160904.0 | 345.0 | 487.429 | 2805.0 | 24.0 | 20.000 | 2978.188 | 6.386 | 9.022 | 51.918 | 0.444 | 0.370 | 0.79 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3991.0 | 0.075 | 0.1098 | 9.1 | tests performed | NaN | NaN | NaN | NaN | NaN | 5159.0 | NaN | NaN | NaN | NaN | 95.0 | 5159.0 | 0.010 | 74.07 | 87.324 | 20.0 | 2.686 | 1.528 | 2993.028 | 36.8 | 218.637 | 2.92 | 1.200 | 20.400 | 24.651 | 1.400 | 66.70 | 0.601 | 5.402748e+07 | NaN | NaN | NaN | NaN |
| 283332 | ARE | Asia | United Arab Emirates | 2022-11-25 | 1043390.0 | 224.0 | 216.571 | 2348.0 | 0.0 | 0.000 | 110515.279 | 23.726 | 22.939 | 248.699 | 0.000 | 0.000 | 0.85 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 13.89 | 112.442 | 34.0 | 1.144 | 0.526 | 67293.483 | NaN | 317.840 | 17.26 | 1.200 | 37.400 | NaN | 1.200 | 77.97 | 0.890 | 9.441138e+06 | NaN | NaN | NaN | NaN |
| 298468 | OWID_WRL | NaN | World | 2022-01-12 | 314730790.0 | 3580417.0 | 2675469.000 | 5527310.0 | 7971.0 | 6946.714 | 39464.156 | 448.949 | 335.478 | 693.070 | 0.999 | 0.871 | 1.29 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9.573635e+09 | 4.680114e+09 | 3.999404e+09 | 830002800.0 | 35773408.0 | 34878426.0 | 120.04 | 58.68 | 50.15 | 10.41 | 4373.0 | 9733993.0 | 0.122 | NaN | 58.045 | 30.9 | 8.696 | 5.355 | 15469.207 | 10.0 | 233.070 | 8.51 | 6.434 | 34.635 | 60.130 | 2.705 | 72.58 | 0.737 | 7.975105e+09 | NaN | NaN | NaN | NaN |
| 211755 | PER | South America | Peru | 2023-03-01 | 4485753.0 | 69.0 | 134.714 | 219431.0 | 2.0 | 11.429 | 131741.770 | 2.026 | 3.956 | 6444.454 | 0.059 | 0.336 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8.805009e+07 | 3.036484e+07 | 2.855090e+07 | 29134346.0 | 18820.0 | 20898.0 | 258.59 | 89.18 | 83.85 | 85.56 | 614.0 | 2214.0 | 0.007 | NaN | 25.129 | 29.1 | 7.151 | 4.455 | 12236.706 | 3.5 | 85.755 | 5.95 | 4.800 | NaN | NaN | 1.600 | 76.74 | 0.777 | 3.404959e+07 | NaN | NaN | NaN | NaN |
df.sample(50) draws 50 random rows, which gives a broader picture of the data than the first few rows alone.
df.columns
Index(['iso_code', 'continent', 'location', 'date', 'total_cases', 'new_cases',
'new_cases_smoothed', 'total_deaths', 'new_deaths',
'new_deaths_smoothed', 'total_cases_per_million',
'new_cases_per_million', 'new_cases_smoothed_per_million',
'total_deaths_per_million', 'new_deaths_per_million',
'new_deaths_smoothed_per_million', 'reproduction_rate', 'icu_patients',
'icu_patients_per_million', 'hosp_patients',
'hosp_patients_per_million', 'weekly_icu_admissions',
'weekly_icu_admissions_per_million', 'weekly_hosp_admissions',
'weekly_hosp_admissions_per_million', 'total_tests', 'new_tests',
'total_tests_per_thousand', 'new_tests_per_thousand',
'new_tests_smoothed', 'new_tests_smoothed_per_thousand',
'positive_rate', 'tests_per_case', 'tests_units', 'total_vaccinations',
'people_vaccinated', 'people_fully_vaccinated', 'total_boosters',
'new_vaccinations', 'new_vaccinations_smoothed',
'total_vaccinations_per_hundred', 'people_vaccinated_per_hundred',
'people_fully_vaccinated_per_hundred', 'total_boosters_per_hundred',
'new_vaccinations_smoothed_per_million',
'new_people_vaccinated_smoothed',
'new_people_vaccinated_smoothed_per_hundred', 'stringency_index',
'population_density', 'median_age', 'aged_65_older', 'aged_70_older',
'gdp_per_capita', 'extreme_poverty', 'cardiovasc_death_rate',
'diabetes_prevalence', 'female_smokers', 'male_smokers',
'handwashing_facilities', 'hospital_beds_per_thousand',
'life_expectancy', 'human_development_index', 'population',
'excess_mortality_cumulative_absolute', 'excess_mortality_cumulative',
'excess_mortality', 'excess_mortality_cumulative_per_million'],
dtype='object')
# Descriptive statistics
df.describe(include='all')
| | iso_code | continent | location | date | total_cases | new_cases | new_cases_smoothed | total_deaths | new_deaths | new_deaths_smoothed | total_cases_per_million | new_cases_per_million | new_cases_smoothed_per_million | total_deaths_per_million | new_deaths_per_million | new_deaths_smoothed_per_million | reproduction_rate | icu_patients | icu_patients_per_million | hosp_patients | hosp_patients_per_million | weekly_icu_admissions | weekly_icu_admissions_per_million | weekly_hosp_admissions | weekly_hosp_admissions_per_million | total_tests | new_tests | total_tests_per_thousand | new_tests_per_thousand | new_tests_smoothed | new_tests_smoothed_per_thousand | positive_rate | tests_per_case | tests_units | total_vaccinations | people_vaccinated | people_fully_vaccinated | total_boosters | new_vaccinations | new_vaccinations_smoothed | total_vaccinations_per_hundred | people_vaccinated_per_hundred | people_fully_vaccinated_per_hundred | total_boosters_per_hundred | new_vaccinations_smoothed_per_million | new_people_vaccinated_smoothed | new_people_vaccinated_smoothed_per_hundred | stringency_index | population_density | median_age | aged_65_older | aged_70_older | gdp_per_capita | extreme_poverty | cardiovasc_death_rate | diabetes_prevalence | female_smokers | male_smokers | handwashing_facilities | hospital_beds_per_thousand | life_expectancy | human_development_index | population | excess_mortality_cumulative_absolute | excess_mortality_cumulative | excess_mortality | excess_mortality_cumulative_per_million |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 302512 | 288160 | 302512 | 302512 | 2.667710e+05 | 2.940640e+05 | 2.928000e+05 | 2.462140e+05 | 294139.000000 | 292909.000000 | 266771.000000 | 294064.000000 | 292800.000000 | 246214.000000 | 294139.000000 | 292909.000000 | 184817.000000 | 34764.000000 | 34764.000000 | 35138.000000 | 35138.00000 | 9101.000000 | 9101.000000 | 21287.00000 | 21287.000000 | 7.938700e+04 | 7.540300e+04 | 79387.000000 | 75403.000000 | 1.039650e+05 | 103965.000000 | 95927.000000 | 9.434800e+04 | 106788 | 7.356100e+04 | 7.041100e+04 | 6.814900e+04 | 4.232400e+04 | 6.054200e+04 | 1.635360e+05 | 73561.000000 | 70411.000000 | 68149.000000 | 42324.000000 | 163536.000000 | 1.635870e+05 | 163587.000000 | 193194.000000 | 256703.000000 | 238751.000000 | 230391.000000 | 236359.000000 | 233979.000000 | 150700.000000 | 234406.000000 | 246348.000000 | 175815.000000 | 173423.000000 | 114817.000000 | 206911.000000 | 278219.000000 | 227212.000000 | 3.025120e+05 | 1.029500e+04 | 10295.000000 | 10295.000000 | 10295.000000 |
| unique | 255 | 6 | 255 | 1198 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| top | ARG | Africa | Argentina | 2022-04-20 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | tests performed | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| freq | 1198 | 68173 | 1198 | 255 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 80099 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| mean | NaN | NaN | NaN | NaN | 5.525632e+06 | 1.100018e+04 | 1.104556e+04 | 7.890977e+04 | 97.976117 | 98.374032 | 84395.533693 | 165.822440 | 166.511964 | 793.821846 | 1.040459 | 1.044517 | 0.911495 | 722.296657 | 17.189731 | 4240.497581 | 143.08819 | 376.183936 | 11.395732 | 4580.91732 | 92.829572 | 2.110457e+07 | 6.728541e+04 | 924.254762 | 3.272466 | 1.421784e+05 | 2.826309 | 0.098163 | 2.403633e+03 | NaN | 3.631105e+08 | 1.624038e+08 | 1.437811e+08 | 8.740611e+07 | 8.565426e+05 | 3.334718e+05 | 113.179679 | 50.438996 | 45.558880 | 32.186728 | 2159.949479 | 1.232478e+05 | 0.087312 | 43.477439 | 412.146751 | 30.510652 | 8.699751 | 5.499055 | 19018.946420 | 13.848086 | 264.274957 | 8.561093 | 10.790064 | 32.909646 | 50.789341 | 3.097013 | 73.718480 | 0.722471 | 1.280398e+08 | 4.727286e+04 | 9.535368 | 12.996518 | 1453.830857 |
| std | NaN | NaN | NaN | NaN | 3.465076e+07 | 1.043446e+05 | 1.016488e+05 | 4.087464e+05 | 606.914602 | 597.602496 | 134636.639039 | 1134.538414 | 642.891130 | 1039.499140 | 4.736573 | 2.947081 | 0.399925 | 2255.245237 | 23.556057 | 10438.204914 | 158.65274 | 546.810269 | 14.262871 | 11486.93403 | 91.369225 | 8.409869e+07 | 2.477340e+05 | 2195.428504 | 9.033843 | 1.138215e+06 | 7.308233 | 0.115978 | 3.344366e+04 | NaN | 1.386860e+09 | 6.183925e+08 | 5.601489e+08 | 3.112422e+08 | 3.430025e+06 | 2.093911e+06 | 83.567228 | 29.963802 | 29.531304 | 29.733811 | 3307.710186 | 8.514408e+05 | 0.188333 | 24.400287 | 1881.833423 | 9.083308 | 6.092702 | 4.134639 | 20012.002263 | 20.091626 | 120.931019 | 4.941349 | 10.779392 | 13.574672 | 31.957428 | 2.548380 | 7.397441 | 0.148991 | 6.594467e+08 | 1.377826e+05 | 13.082029 | 26.634303 | 1830.272458 |
| min | NaN | NaN | NaN | NaN | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 1.000000e+00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | -0.070000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 | 0.000000 | 0.000000 | 0.00000 | 0.000000 | 0.000000e+00 | 1.000000e+00 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000 | 0.000000 | 1.000000e+00 | NaN | 0.000000e+00 | 0.000000e+00 | 1.000000e+00 | 1.000000e+00 | 0.000000e+00 | 0.000000e+00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000e+00 | 0.000000 | 0.000000 | 0.137000 | 15.100000 | 1.144000 | 0.526000 | 661.240000 | 0.100000 | 79.370000 | 0.990000 | 0.100000 | 7.700000 | 1.188000 | 0.100000 | 53.280000 | 0.394000 | 4.700000e+01 | -3.772610e+04 | -44.230000 | -95.920000 | -1984.281600 |
| 25% | NaN | NaN | NaN | NaN | 6.265000e+03 | 0.000000e+00 | 1.286000e+00 | 1.180000e+02 | 0.000000 | 0.000000 | 1889.971500 | 0.000000 | 0.294000 | 47.650000 | 0.000000 | 0.000000 | 0.720000 | 25.000000 | 3.049000 | 259.000000 | 38.85600 | 30.000000 | 2.625000 | 280.00000 | 29.891500 | 3.646540e+05 | 2.244000e+03 | 43.585500 | 0.286000 | 1.486000e+03 | 0.203000 | 0.017000 | 7.100000e+00 | NaN | 1.432613e+06 | 8.311475e+05 | 7.180020e+05 | 2.976735e+05 | 3.264000e+03 | 4.100000e+02 | 33.780000 | 22.960000 | 16.270000 | 3.220000 | 198.000000 | 7.900000e+01 | 0.003000 | 22.770000 | 37.728000 | 22.200000 | 3.526000 | 2.085000 | 3823.194000 | 0.600000 | 175.695000 | 5.350000 | 1.900000 | 22.600000 | 20.859000 | 1.300000 | 69.590000 | 0.602000 | 4.490020e+05 | 2.185000e+01 | 0.420000 | -1.040000 | 15.453504 |
| 50% | NaN | NaN | NaN | NaN | 5.986600e+04 | 1.900000e+01 | 4.000000e+01 | 1.193000e+03 | 0.000000 | 0.286000 | 19249.100000 | 2.808000 | 11.391000 | 324.971000 | 0.000000 | 0.038000 | 0.950000 | 112.000000 | 7.797000 | 858.000000 | 90.40100 | 142.000000 | 6.198000 | 960.00000 | 68.190000 | 2.067330e+06 | 8.783000e+03 | 234.141000 | 0.971000 | 6.570000e+03 | 0.851000 | 0.055000 | 1.750000e+01 | NaN | 1.087405e+07 | 5.337184e+06 | 4.703255e+06 | 3.554642e+06 | 2.856750e+04 | 4.893000e+03 | 111.170000 | 58.110000 | 52.230000 | 28.680000 | 904.000000 | 1.159000e+03 | 0.022000 | 43.215000 | 90.672000 | 29.700000 | 6.378000 | 3.871000 | 12294.876000 | 2.500000 | 245.465000 | 7.200000 | 6.300000 | 33.100000 | 49.839000 | 2.500000 | 75.050000 | 0.740000 | 5.882259e+06 | 4.464499e+03 | 7.750000 | 6.750000 | 881.367860 |
| 75% | NaN | NaN | NaN | NaN | 6.149885e+05 | 5.610000e+02 | 6.530000e+02 | 1.038675e+04 | 6.000000 | 7.000000 | 102486.778500 | 73.163500 | 108.147500 | 1214.197000 | 0.455500 | 0.794000 | 1.140000 | 481.000000 | 21.477750 | 3364.000000 | 188.27175 | 497.000000 | 14.782000 | 4344.00000 | 125.494500 | 1.024845e+07 | 3.722900e+04 | 894.374500 | 2.914000 | 3.220500e+04 | 2.584000 | 0.138100 | 5.460000e+01 | NaN | 7.945056e+07 | 3.930546e+07 | 3.161463e+07 | 2.601720e+07 | 2.325775e+05 | 3.853575e+04 | 181.320000 | 76.380000 | 71.920000 | 55.100000 | 2910.000000 | 1.235000e+04 | 0.093000 | 62.500000 | 222.873000 | 38.700000 | 13.928000 | 8.643000 | 27216.445000 | 21.400000 | 333.436000 | 10.790000 | 19.300000 | 41.300000 | 83.241000 | 4.200000 | 79.460000 | 0.829000 | 2.830170e+07 | 3.174990e+04 | 15.520000 | 18.545000 | 2372.123350 |
| max | NaN | NaN | NaN | NaN | 7.627904e+08 | 7.460100e+06 | 6.410233e+06 | 6.897012e+06 | 20005.000000 | 14578.571000 | 731762.140000 | 228872.025000 | 37241.781000 | 6457.229000 | 603.656000 | 148.641000 | 5.870000 | 28891.000000 | 180.675000 | 154497.000000 | 1526.84600 | 4838.000000 | 224.976000 | 153977.00000 | 708.120000 | 9.214000e+09 | 3.585563e+07 | 32925.826000 | 531.062000 | 1.476998e+07 | 147.603000 | 1.000000 | 1.023632e+06 | NaN | 1.336812e+10 | 5.572334e+09 | 5.127133e+09 | 2.759853e+09 | 4.967301e+07 | 4.369274e+07 | 406.430000 | 129.070000 | 126.890000 | 150.470000 | 117113.000000 | 2.107109e+07 | 11.711000 | 100.000000 | 20546.766000 | 48.200000 | 27.049000 | 18.493000 | 116935.600000 | 77.600000 | 724.417000 | 30.530000 | 44.000000 | 78.100000 | 100.000000 | 13.800000 | 86.750000 | 0.957000 | 7.975105e+09 | 1.282260e+06 | 76.550000 | 377.040000 | 10329.523000 |
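With 67 columns, the combined `describe(include='all')` table is hard to scan, and most cells are NaN because numeric statistics do not apply to text columns and vice versa. A minimal sketch of one alternative, assuming a small stand-in frame named `toy` (hypothetical, not the COVID data), summarizes the two kinds of columns separately and transposes each result so that every column becomes one row:

```python
import pandas as pd

# Toy stand-in for the COVID frame; the values are illustrative only.
toy = pd.DataFrame({
    "location": ["A", "A", "B", "B"],
    "new_cases": [0.0, 5.0, 2.0, None],
    "new_deaths": [0.0, 1.0, 0.0, 0.0],
})

# Default describe() covers numeric columns only; .T puts one column per row.
numeric_summary = toy.describe().T

# Non-numeric columns get count, unique, top and freq instead.
categorical_summary = toy.select_dtypes(exclude="number").describe().T

print(numeric_summary)
print(categorical_summary)
```

The same two calls should work on `df` itself; only the toy frame is an assumption here.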
# Check for missing values
df.isnull().sum().sort_values(ascending=False) # this will show the number of null values in each column in descending order
| | 0 |
|---|---|
| weekly_icu_admissions | 293411 |
| weekly_icu_admissions_per_million | 293411 |
| excess_mortality_cumulative_absolute | 292217 |
| excess_mortality_cumulative_per_million | 292217 |
| excess_mortality_cumulative | 292217 |
| excess_mortality | 292217 |
| weekly_hosp_admissions_per_million | 281225 |
| weekly_hosp_admissions | 281225 |
| icu_patients_per_million | 267748 |
| icu_patients | 267748 |
| hosp_patients_per_million | 267374 |
| hosp_patients | 267374 |
| total_boosters | 260188 |
| total_boosters_per_hundred | 260188 |
| new_vaccinations | 241970 |
| people_fully_vaccinated | 234363 |
| people_fully_vaccinated_per_hundred | 234363 |
| people_vaccinated | 232101 |
| people_vaccinated_per_hundred | 232101 |
| total_vaccinations_per_hundred | 228951 |
| total_vaccinations | 228951 |
| new_tests | 227109 |
| new_tests_per_thousand | 227109 |
| total_tests | 223125 |
| total_tests_per_thousand | 223125 |
| tests_per_case | 208164 |
| positive_rate | 206585 |
| new_tests_smoothed_per_thousand | 198547 |
| new_tests_smoothed | 198547 |
| tests_units | 195724 |
| handwashing_facilities | 187695 |
| extreme_poverty | 151812 |
| new_vaccinations_smoothed_per_million | 138976 |
| new_vaccinations_smoothed | 138976 |
| new_people_vaccinated_smoothed_per_hundred | 138925 |
| new_people_vaccinated_smoothed | 138925 |
| male_smokers | 129089 |
| female_smokers | 126697 |
| reproduction_rate | 117695 |
| stringency_index | 109318 |
| hospital_beds_per_thousand | 95601 |
| human_development_index | 75300 |
| aged_65_older | 72121 |
| gdp_per_capita | 68533 |
| cardiovasc_death_rate | 68106 |
| aged_70_older | 66153 |
| median_age | 63761 |
| total_deaths | 56298 |
| total_deaths_per_million | 56298 |
| diabetes_prevalence | 56164 |
| population_density | 45809 |
| total_cases_per_million | 35741 |
| total_cases | 35741 |
| life_expectancy | 24293 |
| continent | 14352 |
| new_cases_smoothed | 9712 |
| new_cases_smoothed_per_million | 9712 |
| new_deaths_smoothed_per_million | 9603 |
| new_deaths_smoothed | 9603 |
| new_cases | 8448 |
| new_cases_per_million | 8448 |
| new_deaths | 8373 |
| new_deaths_per_million | 8373 |
| population | 0 |
| date | 0 |
| location | 0 |
| iso_code | 0 |
(df.isnull().sum() / len(df) * 100).sort_values(ascending=False) # this will show the percentage of null values in each column in descending order
| | 0 |
|---|---|
| weekly_icu_admissions | 96.991524 |
| weekly_icu_admissions_per_million | 96.991524 |
| excess_mortality_cumulative_absolute | 96.596829 |
| excess_mortality_cumulative_per_million | 96.596829 |
| excess_mortality_cumulative | 96.596829 |
| excess_mortality | 96.596829 |
| weekly_hosp_admissions_per_million | 92.963254 |
| weekly_hosp_admissions | 92.963254 |
| icu_patients_per_million | 88.508224 |
| icu_patients | 88.508224 |
| hosp_patients_per_million | 88.384593 |
| hosp_patients | 88.384593 |
| total_boosters | 86.009150 |
| total_boosters_per_hundred | 86.009150 |
| new_vaccinations | 79.986910 |
| people_fully_vaccinated | 77.472299 |
| people_fully_vaccinated_per_hundred | 77.472299 |
| people_vaccinated | 76.724560 |
| people_vaccinated_per_hundred | 76.724560 |
| total_vaccinations_per_hundred | 75.683279 |
| total_vaccinations | 75.683279 |
| new_tests | 75.074377 |
| new_tests_per_thousand | 75.074377 |
| total_tests | 73.757405 |
| total_tests_per_thousand | 73.757405 |
| tests_per_case | 68.811816 |
| positive_rate | 68.289853 |
| new_tests_smoothed_per_thousand | 65.632768 |
| new_tests_smoothed | 65.632768 |
| tests_units | 64.699582 |
| handwashing_facilities | 62.045473 |
| extreme_poverty | 50.183794 |
| new_vaccinations_smoothed_per_million | 45.940657 |
| new_vaccinations_smoothed | 45.940657 |
| new_people_vaccinated_smoothed_per_hundred | 45.923798 |
| new_people_vaccinated_smoothed | 45.923798 |
| male_smokers | 42.672357 |
| female_smokers | 41.881644 |
| reproduction_rate | 38.905895 |
| stringency_index | 36.136748 |
| hospital_beds_per_thousand | 31.602383 |
| human_development_index | 24.891575 |
| aged_65_older | 23.840707 |
| gdp_per_capita | 22.654638 |
| cardiovasc_death_rate | 22.513487 |
| aged_70_older | 21.867893 |
| median_age | 21.077180 |
| total_deaths | 18.610171 |
| total_deaths_per_million | 18.610171 |
| diabetes_prevalence | 18.565875 |
| population_density | 15.142870 |
| total_cases_per_million | 11.814738 |
| total_cases | 11.814738 |
| life_expectancy | 8.030425 |
| continent | 4.744275 |
| new_cases_smoothed | 3.210451 |
| new_cases_smoothed_per_million | 3.210451 |
| new_deaths_smoothed_per_million | 3.174420 |
| new_deaths_smoothed | 3.174420 |
| new_cases | 2.792616 |
| new_cases_per_million | 2.792616 |
| new_deaths | 2.767824 |
| new_deaths_per_million | 2.767824 |
| population | 0.000000 |
| date | 0.000000 |
| location | 0.000000 |
| iso_code | 0.000000 |
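The count and percentage views above carry the same information, so they can be read side by side in one table. The sketch below, again on a hypothetical toy frame, builds that combined table and lists the columns whose missing share exceeds a cut-off. The 50% threshold and the name `THRESHOLD_PCT` are assumptions chosen for illustration, not a rule taken from this analysis.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the COVID frame; the values are illustrative only.
toy = pd.DataFrame({
    "location": ["A", "A", "B", "B"],
    "new_cases": [1.0, np.nan, 3.0, 4.0],
    "icu_patients": [np.nan, np.nan, np.nan, 7.0],
})

# One table with both the absolute count and the percentage of nulls.
missing = pd.DataFrame({
    "missing_count": toy.isnull().sum(),
    "missing_pct": toy.isnull().mean() * 100,
}).sort_values("missing_pct", ascending=False)

# Hypothetical cut-off: flag columns that are more than half empty.
THRESHOLD_PCT = 50
sparse_cols = missing.index[missing["missing_pct"] > THRESHOLD_PCT].tolist()

print(missing)
print(sparse_cols)
```

Columns flagged this way (in the real data, for example, the ICU and excess-mortality fields at over 88% missing) may be better dropped or analyzed separately than imputed.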
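# Forward fill, then backward fill, so each gap takes the nearest observed value
# Note: this runs over the whole frame in row order, not separately per location
# (df.ffill / df.bfill replace the deprecated fillna(method=...) form)
df.ffill(inplace=True)
df.bfill(inplace=True)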
# Separate numerical and categorical columns
num_cols = df.select_dtypes(include=['number']).columns
cat_cols = df.select_dtypes(include=['object']).columns
# Fill numerical columns with median (in case ffill and bfill didn't work)
df[num_cols] = df[num_cols].apply(lambda x: x.fillna(x.median()))
# Fill categorical columns with mode
df[cat_cols] = df[cat_cols].apply(lambda x: x.fillna(x.mode()[0]))
# Check if all missing values are handled
print("Total remaining missing values:")
print(df.isnull().sum().sum()) # Should be 0
Total remaining missing values:
0
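The forward and backward fills in this step run over the whole frame in row order, so a gap at the start of one location's series can be filled with the last value of the preceding location. A hedged sketch of a per-location alternative, using a hypothetical toy frame rather than the real data, shows the difference:

```python
import numpy as np
import pandas as pd

# Toy stand-in: two locations, three days each; values are illustrative only.
toy = pd.DataFrame({
    "location": ["A", "A", "A", "B", "B", "B"],
    "date": ["2021-01-01", "2021-01-02", "2021-01-03"] * 2,
    "total_cases": [10.0, np.nan, 30.0, np.nan, np.nan, 5.0],
})

# Time-ordered within each location before filling.
toy = toy.sort_values(["location", "date"])

# Whole-frame fill: location B's leading gaps inherit A's last value (30).
global_fill = toy["total_cases"].ffill().bfill()

# Per-location fill: values never cross a location boundary.
per_location = (
    toy.groupby("location")["total_cases"]
       .transform(lambda s: s.ffill().bfill())
)

print(global_fill.tolist())
print(per_location.tolist())
```

Switching the notebook to a grouped fill would change the downstream numbers, so this is offered as a point of comparison and not as a drop-in replacement.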
# Check for duplicate rows
duplicates = df.duplicated().sum()
print(f"Number of duplicate rows: {duplicates}")
Number of duplicate rows: 0
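`df.duplicated()` only flags rows that match in every column. For a panel like this one, a stricter check is whether any (location, date) pair occurs more than once, even if the measurements differ. The sketch below illustrates the idea on a hypothetical toy frame; treating `location` and `date` as the key is an assumption about how the data is meant to be organized.

```python
import pandas as pd

# Toy stand-in: location A has two rows for the same date with different values.
toy = pd.DataFrame({
    "location": ["A", "A", "B", "B"],
    "date": ["2021-01-01", "2021-01-01", "2021-01-01", "2021-01-02"],
    "new_cases": [1.0, 2.0, 3.0, 4.0],
})

# Full-row check: no two rows are identical in every column.
full_row_dupes = toy.duplicated().sum()

# Key-level check: the second (A, 2021-01-01) row is flagged.
key_dupes = toy.duplicated(subset=["location", "date"]).sum()

print(f"Full-row duplicates: {full_row_dupes}")
print(f"Duplicate (location, date) keys: {key_dupes}")
```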
# Look for any columns with invalid data types or strange values
print(df.dtypes)
iso_code object
continent object
location object
date object
total_cases float64
new_cases float64
new_cases_smoothed float64
total_deaths float64
new_deaths float64
new_deaths_smoothed float64
total_cases_per_million float64
new_cases_per_million float64
new_cases_smoothed_per_million float64
total_deaths_per_million float64
new_deaths_per_million float64
new_deaths_smoothed_per_million float64
reproduction_rate float64
icu_patients float64
icu_patients_per_million float64
hosp_patients float64
hosp_patients_per_million float64
weekly_icu_admissions float64
weekly_icu_admissions_per_million float64
weekly_hosp_admissions float64
weekly_hosp_admissions_per_million float64
total_tests float64
new_tests float64
total_tests_per_thousand float64
new_tests_per_thousand float64
new_tests_smoothed float64
new_tests_smoothed_per_thousand float64
positive_rate float64
tests_per_case float64
tests_units object
total_vaccinations float64
people_vaccinated float64
people_fully_vaccinated float64
total_boosters float64
new_vaccinations float64
new_vaccinations_smoothed float64
total_vaccinations_per_hundred float64
people_vaccinated_per_hundred float64
people_fully_vaccinated_per_hundred float64
total_boosters_per_hundred float64
new_vaccinations_smoothed_per_million float64
new_people_vaccinated_smoothed float64
new_people_vaccinated_smoothed_per_hundred float64
stringency_index float64
population_density float64
median_age float64
aged_65_older float64
aged_70_older float64
gdp_per_capita float64
extreme_poverty float64
cardiovasc_death_rate float64
diabetes_prevalence float64
female_smokers float64
male_smokers float64
handwashing_facilities float64
hospital_beds_per_thousand float64
life_expectancy float64
human_development_index float64
population float64
excess_mortality_cumulative_absolute float64
excess_mortality_cumulative float64
excess_mortality float64
excess_mortality_cumulative_per_million float64
dtype: object
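The listing shows `date` stored as `object`, meaning plain strings. Sorting, resampling and date arithmetic are more reliable on a real datetime column. A minimal sketch of the conversion on a hypothetical toy frame, assuming the `YYYY-MM-DD` format seen in the sampled rows:

```python
import pandas as pd

# Toy stand-in with dates in the same YYYY-MM-DD form as the dataset.
toy = pd.DataFrame({"date": ["2021-05-06", "2020-01-21", "2023-02-12"]})

# Parse the strings; an explicit format raises on any value that does not fit.
toy["date"] = pd.to_datetime(toy["date"], format="%Y-%m-%d")

print(toy["date"].dtype)
print(toy["date"].min(), toy["date"].max())
```

On the real frame the equivalent call would be `df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')`, after which `date` no longer appears among the `object` columns.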
# Identify categorical columns
cat_cols = df.select_dtypes(include=['object']).columns
# Check for inconsistent categorical values
for col in cat_cols:
    print(f"Unique values in '{col}':")
    print(df[col].value_counts(dropna=False)) # Includes NaN counts
    print("-" * 50) # Separator for readability
Unique values in 'iso_code':
iso_code
ARG 1198
MEX 1198
AFG 1196
PLW 1196
NIC 1196
NER 1196
NGA 1196
NIU 1196
OWID_NAM 1196
PRK 1196
MKD 1196
MNP 1196
NOR 1196
OWID_OCE 1196
OMN 1196
PAK 1196
PSE 1196
NCL 1196
PAN 1196
PNG 1196
PRY 1196
PER 1196
PHL 1196
PCN 1196
POL 1196
PRT 1196
PRI 1196
QAT 1196
REU 1196
ROU 1196
RUS 1196
NZL 1196
NLD 1196
BLM 1196
MRT 1196
LTU 1196
OWID_AFR 1196
OWID_LMC 1196
LUX 1196
MDG 1196
MWI 1196
MYS 1196
MDV 1196
MLI 1196
MLT 1196
MHL 1196
MTQ 1196
MUS 1196
NPL 1196
MYT 1196
FSM 1196
MDA 1196
MCO 1196
MNG 1196
MNE 1196
MSR 1196
MAR 1196
MOZ 1196
MMR 1196
NAM 1196
NRU 1196
RWA 1196
SHN 1196
LBY 1196
TZA 1196
TLS 1196
TGO 1196
TKL 1196
TON 1196
TTO 1196
TUN 1196
TUR 1196
TKM 1196
TCA 1196
TUV 1196
UGA 1196
UKR 1196
ARE 1196
GBR 1196
USA 1196
VIR 1196
OWID_UMC 1196
URY 1196
UZB 1196
VUT 1196
VAT 1196
VEN 1196
VNM 1196
WLF 1196
OWID_WRL 1196
YEM 1196
ZMB 1196
THA 1196
TJK 1196
KNA 1196
SYR 1196
LCA 1196
MAF 1196
SPM 1196
VCT 1196
WSM 1196
SMR 1196
STP 1196
SAU 1196
SEN 1196
SRB 1196
SYC 1196
SLE 1196
SGP 1196
SXM 1196
SVK 1196
SVN 1196
SLB 1196
SOM 1196
ZAF 1196
OWID_SAM 1196
KOR 1196
SSD 1196
ESP 1196
LKA 1196
SDN 1196
SWE 1196
CHE 1196
LIE 1196
OWID_LIC 1196
LBR 1196
COG 1196
BFA 1196
BDI 1196
KHM 1196
CMR 1196
CAN 1196
CPV 1196
CYM 1196
CAF 1196
TCD 1196
CHL 1196
CHN 1196
COL 1196
COM 1196
COK 1196
SLV 1196
CRI 1196
CIV 1196
HRV 1196
CUB 1196
CUW 1196
CYP 1196
CZE 1196
COD 1196
DNK 1196
DJI 1196
DMA 1196
DOM 1196
LSO 1196
BGR 1196
BRN 1196
VGB 1196
BRA 1196
ALB 1196
DZA 1196
ASM 1196
AND 1196
AGO 1196
AIA 1196
ATG 1196
ARM 1196
ABW 1196
OWID_ASI 1196
AUS 1196
AUT 1196
AZE 1196
BHS 1196
BHR 1196
BGD 1196
BRB 1196
BLR 1196
BEL 1196
BLZ 1196
BEN 1196
BMU 1196
BTN 1196
BOL 1196
BES 1196
BIH 1196
BWA 1196
EGY 1196
ECU 1196
ZWE 1196
IDN 1196
GGY 1196
GIN 1196
GNQ 1196
GNB 1196
GUY 1196
HTI 1196
OWID_HIC 1196
HND 1196
KGZ 1196
HUN 1196
ISL 1196
IND 1196
IRN 1196
KWT 1196
IRQ 1196
IRL 1196
IMN 1196
ISR 1196
ITA 1196
JAM 1196
JPN 1196
JEY 1196
JOR 1196
KAZ 1196
KEN 1196
KIR 1196
GTM 1196
OWID_KOS 1196
GUM 1196
GLP 1196
ERI 1196
EST 1196
SWZ 1196
ETH 1196
OWID_EUR 1196
OWID_EUN 1196
FRO 1196
FLK 1196
FJI 1196
LBN 1196
FIN 1196
FRA 1196
GUF 1196
PYF 1196
GAB 1196
GMB 1196
GEO 1196
DEU 1196
GHA 1196
GIB 1196
LVA 1196
LAO 1196
GRC 1196
GRL 1196
GRD 1196
SUR 1195
TWN 1183
HKG 1165
OWID_NIR 1131
OWID_SCT 1123
OWID_ENG 1112
OWID_WLS 1100
MAC 787
OWID_CYN 691
ESH 1
Name: count, dtype: int64
--------------------------------------------------
Unique values in 'continent':
continent
Africa 72957
Europe 69050
Asia 61234
North America 52626
Oceania 29900
South America 16745
Name: count, dtype: int64
--------------------------------------------------
Unique values in 'location':
location
Argentina 1198
Mexico 1198
Afghanistan 1196
Palau 1196
Nicaragua 1196
Niger 1196
Nigeria 1196
Niue 1196
North America 1196
North Korea 1196
North Macedonia 1196
Northern Mariana Islands 1196
Norway 1196
Oceania 1196
Oman 1196
Pakistan 1196
Palestine 1196
New Caledonia 1196
Panama 1196
Papua New Guinea 1196
Paraguay 1196
Peru 1196
Philippines 1196
Pitcairn 1196
Poland 1196
Portugal 1196
Puerto Rico 1196
Qatar 1196
Reunion 1196
Romania 1196
Russia 1196
New Zealand 1196
Netherlands 1196
Saint Barthelemy 1196
Mauritania 1196
Lithuania 1196
Africa 1196
Lower middle income 1196
Luxembourg 1196
Madagascar 1196
Malawi 1196
Malaysia 1196
Maldives 1196
Mali 1196
Malta 1196
Marshall Islands 1196
Martinique 1196
Mauritius 1196
Nepal 1196
Mayotte 1196
Micronesia (country) 1196
Moldova 1196
Monaco 1196
Mongolia 1196
Montenegro 1196
Montserrat 1196
Morocco 1196
Mozambique 1196
Myanmar 1196
Namibia 1196
Nauru 1196
Rwanda 1196
Saint Helena 1196
Libya 1196
Tanzania 1196
Timor 1196
Togo 1196
Tokelau 1196
Tonga 1196
Trinidad and Tobago 1196
Tunisia 1196
Turkey 1196
Turkmenistan 1196
Turks and Caicos Islands 1196
Tuvalu 1196
Uganda 1196
Ukraine 1196
United Arab Emirates 1196
United Kingdom 1196
United States 1196
United States Virgin Islands 1196
Upper middle income 1196
Uruguay 1196
Uzbekistan 1196
Vanuatu 1196
Vatican 1196
Venezuela 1196
Vietnam 1196
Wallis and Futuna 1196
World 1196
Yemen 1196
Zambia 1196
Thailand 1196
Tajikistan 1196
Saint Kitts and Nevis 1196
Syria 1196
Saint Lucia 1196
Saint Martin (French part) 1196
Saint Pierre and Miquelon 1196
Saint Vincent and the Grenadines 1196
Samoa 1196
San Marino 1196
Sao Tome and Principe 1196
Saudi Arabia 1196
Senegal 1196
Serbia 1196
Seychelles 1196
Sierra Leone 1196
Singapore 1196
Sint Maarten (Dutch part) 1196
Slovakia 1196
Slovenia 1196
Solomon Islands 1196
Somalia 1196
South Africa 1196
South America 1196
South Korea 1196
South Sudan 1196
Spain 1196
Sri Lanka 1196
Sudan 1196
Sweden 1196
Switzerland 1196
Liechtenstein 1196
Low income 1196
Liberia 1196
Congo 1196
Burkina Faso 1196
Burundi 1196
Cambodia 1196
Cameroon 1196
Canada 1196
Cape Verde 1196
Cayman Islands 1196
Central African Republic 1196
Chad 1196
Chile 1196
China 1196
Colombia 1196
Comoros 1196
Cook Islands 1196
El Salvador 1196
Costa Rica 1196
Cote d'Ivoire 1196
Croatia 1196
Cuba 1196
Curacao 1196
Cyprus 1196
Czechia 1196
Democratic Republic of Congo 1196
Denmark 1196
Djibouti 1196
Dominica 1196
Dominican Republic 1196
Lesotho 1196
Bulgaria 1196
Brunei 1196
British Virgin Islands 1196
Brazil 1196
Albania 1196
Algeria 1196
American Samoa 1196
Andorra 1196
Angola 1196
Anguilla 1196
Antigua and Barbuda 1196
Armenia 1196
Aruba 1196
Asia 1196
Australia 1196
Austria 1196
Azerbaijan 1196
Bahamas 1196
Bahrain 1196
Bangladesh 1196
Barbados 1196
Belarus 1196
Belgium 1196
Belize 1196
Benin 1196
Bermuda 1196
Bhutan 1196
Bolivia 1196
Bonaire Sint Eustatius and Saba 1196
Bosnia and Herzegovina 1196
Botswana 1196
Egypt 1196
Ecuador 1196
Zimbabwe 1196
Indonesia 1196
Guernsey 1196
Guinea 1196
Equatorial Guinea 1196
Guinea-Bissau 1196
Guyana 1196
Haiti 1196
High income 1196
Honduras 1196
Kyrgyzstan 1196
Hungary 1196
Iceland 1196
India 1196
Iran 1196
Kuwait 1196
Iraq 1196
Ireland 1196
Isle of Man 1196
Israel 1196
Italy 1196
Jamaica 1196
Japan 1196
Jersey 1196
Jordan 1196
Kazakhstan 1196
Kenya 1196
Kiribati 1196
Guatemala 1196
Kosovo 1196
Guam 1196
Guadeloupe 1196
Eritrea 1196
Estonia 1196
Eswatini 1196
Ethiopia 1196
Europe 1196
European Union 1196
Faeroe Islands 1196
Falkland Islands 1196
Fiji 1196
Lebanon 1196
Finland 1196
France 1196
French Guiana 1196
French Polynesia 1196
Gabon 1196
Gambia 1196
Georgia 1196
Germany 1196
Ghana 1196
Gibraltar 1196
Latvia 1196
Laos 1196
Greece 1196
Greenland 1196
Grenada 1196
Suriname 1195
Taiwan 1183
Hong Kong 1165
Northern Ireland 1131
Scotland 1123
England 1112
Wales 1100
Macao 787
Northern Cyprus 691
Western Sahara 1
Name: count, dtype: int64
--------------------------------------------------
Unique values in 'date':
date
2022-04-20 255
2021-08-24 254
2021-12-22 254
2021-12-09 254
2021-12-10 254
2021-12-11 254
2021-12-12 254
2021-12-13 254
2021-12-14 254
2021-12-15 254
2021-12-16 254
2021-12-17 254
2021-12-18 254
2021-12-19 254
2021-12-20 254
2021-12-21 254
2021-12-23 254
2021-12-07 254
2021-12-24 254
2021-12-25 254
2021-12-26 254
2021-12-27 254
2021-12-28 254
2021-12-29 254
2021-12-30 254
2021-12-31 254
2022-01-01 254
2022-01-02 254
2022-01-03 254
2022-01-04 254
2022-01-05 254
2022-01-06 254
2021-12-08 254
2021-12-06 254
2022-01-08 254
2021-11-19 254
2021-11-05 254
2021-11-06 254
2021-11-07 254
2021-11-08 254
2021-11-09 254
2021-11-10 254
2021-11-11 254
2021-11-12 254
2021-11-13 254
2021-11-14 254
2021-11-15 254
2021-11-16 254
2021-11-17 254
2021-11-18 254
2021-11-20 254
2021-12-05 254
2021-11-21 254
2021-11-22 254
2021-11-23 254
2021-11-24 254
2021-11-25 254
2021-11-26 254
2021-11-27 254
2021-11-28 254
2021-11-29 254
2021-11-30 254
2021-12-01 254
2021-12-02 254
2021-12-03 254
2021-12-04 254
2022-01-07 254
2022-01-09 254
2021-11-03 254
2022-02-11 254
2022-02-13 254
2022-02-14 254
2022-02-15 254
2022-02-16 254
2022-02-17 254
2022-02-18 254
2022-02-19 254
2022-02-20 254
2022-02-21 254
2022-02-22 254
2022-02-23 254
2022-02-24 254
2022-02-25 254
2022-02-26 254
2022-02-27 254
2022-02-28 254
2022-03-01 254
2022-03-02 254
2022-03-03 254
2022-03-04 254
2022-03-05 254
2022-03-06 254
2022-03-07 254
2022-03-08 254
2022-03-09 254
2022-03-10 254
2022-03-11 254
2022-03-12 254
2022-03-13 254
2022-02-12 254
2022-02-10 254
2022-01-10 254
2022-02-09 254
2022-01-11 254
2022-01-12 254
2022-01-13 254
2022-01-14 254
2022-01-15 254
2022-01-16 254
2022-01-17 254
2022-01-18 254
2022-01-19 254
2022-01-20 254
2022-01-21 254
2022-01-22 254
2022-01-23 254
2022-01-24 254
2022-01-25 254
2022-01-26 254
2022-01-27 254
2022-01-28 254
2022-01-29 254
2022-01-30 254
2022-01-31 254
2022-02-01 254
2022-02-02 254
2022-02-03 254
2022-02-04 254
2022-02-05 254
2022-02-06 254
2022-02-07 254
2022-02-08 254
2021-11-04 254
2021-11-02 254
2022-03-15 254
2021-08-08 254
2021-07-25 254
2021-07-26 254
2021-07-27 254
2021-07-28 254
2021-07-29 254
2021-07-30 254
2021-07-31 254
2021-08-01 254
2021-08-02 254
2021-08-03 254
2021-08-04 254
2021-08-05 254
2021-08-06 254
2021-08-07 254
2021-08-09 254
2021-08-27 254
2021-08-10 254
2021-08-11 254
2021-08-12 254
2021-08-13 254
2021-08-14 254
2021-08-15 254
2021-08-17 254
2021-08-18 254
2021-08-19 254
2021-08-20 254
2021-08-21 254
2021-08-22 254
2021-08-23 254
2021-08-25 254
2021-07-24 254
2021-07-23 254
2021-07-22 254
2021-07-21 254
2021-06-22 254
2021-06-23 254
2021-06-24 254
2021-06-25 254
2021-06-26 254
2021-06-27 254
2021-06-28 254
2021-06-29 254
2021-06-30 254
2021-07-01 254
2021-07-02 254
2021-07-03 254
2021-07-04 254
2021-07-05 254
2021-07-06 254
2021-07-07 254
2021-07-08 254
2021-07-09 254
2021-07-10 254
2021-07-11 254
2021-07-12 254
2021-07-13 254
2021-07-14 254
2021-07-15 254
2021-07-16 254
2021-07-17 254
2021-07-18 254
2021-07-19 254
2021-07-20 254
2021-08-26 254
2021-08-28 254
2021-11-01 254
2021-10-16 254
2021-10-02 254
2021-10-03 254
2021-10-04 254
2021-10-05 254
2021-10-06 254
2021-10-07 254
2021-10-08 254
2021-10-09 254
2021-10-10 254
2021-10-11 254
2021-10-12 254
2021-10-13 254
2021-10-14 254
2021-10-15 254
2021-10-17 254
2021-08-29 254
2021-10-18 254
2021-10-19 254
2021-10-20 254
2021-10-21 254
2021-10-22 254
2021-10-23 254
2021-10-24 254
2021-10-25 254
2021-10-26 254
2021-10-27 254
2021-10-28 254
2021-10-29 254
2021-10-30 254
2021-10-31 254
2021-10-01 254
2021-09-30 254
2021-09-29 254
2021-09-28 254
2021-08-30 254
2021-08-31 254
2021-09-01 254
2021-09-02 254
2021-09-03 254
2021-09-04 254
2021-09-05 254
2021-09-06 254
2021-09-07 254
2021-09-08 254
2021-09-09 254
2021-09-10 254
2021-09-11 254
2021-09-12 254
2021-09-13 254
2021-09-14 254
2021-09-15 254
2021-09-16 254
2021-09-17 254
2021-09-18 254
2021-09-19 254
2021-09-20 254
2021-09-21 254
2021-09-22 254
2021-09-23 254
2021-09-24 254
2021-09-25 254
2021-09-26 254
2021-09-27 254
2022-03-14 254
2022-03-16 254
2021-06-20 254
2022-09-15 254
2022-09-01 254
2022-09-02 254
2022-09-03 254
2022-09-04 254
2022-09-05 254
2022-09-06 254
2022-09-07 254
2022-09-08 254
2022-09-09 254
2022-09-10 254
2022-09-11 254
2022-09-12 254
2022-09-13 254
2022-09-14 254
2022-09-16 254
2022-07-28 254
2022-09-17 254
2022-09-18 254
2022-09-19 254
2022-09-20 254
2022-09-21 254
2022-09-22 254
2022-09-23 254
2022-09-24 254
2022-09-25 254
2022-09-26 254
2022-09-27 254
2022-09-28 254
2022-09-29 254
2022-09-30 254
2022-08-31 254
2022-08-30 254
2022-08-29 254
2022-08-28 254
2022-07-30 254
2022-07-31 254
2022-08-01 254
2022-08-02 254
2022-08-03 254
2022-08-04 254
2022-08-05 254
2022-08-06 254
2022-08-07 254
2022-08-08 254
2022-08-09 254
2022-08-10 254
2022-08-11 254
2022-08-12 254
2022-08-13 254
2022-08-14 254
2022-08-15 254
2022-08-16 254
2022-08-17 254
2022-08-18 254
2022-08-19 254
2022-08-20 254
2022-08-21 254
2022-08-22 254
2022-08-23 254
2022-08-24 254
2022-08-25 254
2022-08-26 254
2022-08-27 254
2022-10-01 254
2022-10-02 254
2022-10-03 254
2022-11-05 254
2022-11-07 254
2022-11-08 254
2022-11-09 254
2022-11-10 254
2022-11-11 254
2022-11-12 254
2022-11-13 254
2022-11-14 254
2022-11-15 254
2022-11-16 254
2022-11-17 254
2022-11-18 254
2022-11-19 254
2022-11-20 254
2022-11-21 254
2022-11-22 254
2022-11-23 254
2022-11-24 254
2022-11-25 254
2022-11-26 254
2022-11-27 254
2022-11-28 254
2022-11-29 254
2022-11-30 254
2022-12-01 254
2022-12-02 254
2022-12-03 254
2022-12-04 254
2022-12-06 254
2022-11-06 254
2022-11-04 254
2022-10-04 254
2022-11-03 254
2022-10-05 254
2022-10-06 254
2022-10-07 254
2022-10-08 254
2022-10-09 254
2022-10-10 254
2022-10-11 254
2022-10-12 254
2022-10-13 254
2022-10-14 254
2022-10-15 254
2022-10-16 254
2022-10-17 254
2022-10-18 254
2022-10-19 254
2022-10-20 254
2022-10-21 254
2022-10-22 254
2022-10-23 254
2022-10-24 254
2022-10-25 254
2022-10-26 254
2022-10-27 254
2022-10-28 254
2022-10-29 254
2022-10-30 254
2022-10-31 254
2022-11-01 254
2022-11-02 254
2022-07-29 254
2022-07-27 254
2022-03-17 254
2022-05-05 254
2022-04-21 254
2022-04-22 254
2022-04-23 254
2022-04-24 254
2022-04-25 254
2022-04-26 254
2022-04-27 254
2022-04-28 254
2022-04-29 254
2022-04-30 254
2022-05-01 254
2022-05-02 254
2022-05-03 254
2022-05-04 254
2022-05-06 254
2022-07-26 254
2022-05-07 254
2022-05-08 254
2022-05-09 254
2022-05-10 254
2022-05-11 254
2022-05-12 254
2022-05-13 254
2022-05-14 254
2022-05-15 254
2022-05-16 254
2022-05-17 254
2022-05-18 254
2022-05-19 254
2022-05-20 254
2022-04-19 254
2022-04-18 254
2022-04-17 254
2022-04-16 254
2022-03-18 254
2022-03-19 254
2022-03-20 254
2022-03-21 254
2022-03-22 254
2022-03-23 254
2022-03-24 254
2022-03-25 254
2022-03-26 254
2022-03-27 254
2022-03-28 254
2022-03-29 254
2022-03-30 254
2022-03-31 254
2022-04-01 254
2022-04-02 254
2022-04-03 254
2022-04-04 254
2022-04-05 254
2022-04-06 254
2022-04-07 254
2022-04-08 254
2022-04-09 254
2022-04-10 254
2022-04-11 254
2022-04-12 254
2022-04-13 254
2022-04-14 254
2022-04-15 254
2022-05-21 254
2022-05-22 254
2022-05-23 254
2022-06-25 254
2022-06-27 254
2022-06-28 254
2022-06-29 254
2022-06-30 254
2022-07-01 254
2022-07-02 254
2022-07-03 254
2022-07-04 254
2022-07-05 254
2022-07-06 254
2022-07-07 254
2022-07-08 254
2022-07-09 254
2022-07-10 254
2022-07-11 254
2022-07-12 254
2022-07-13 254
2022-07-14 254
2022-07-15 254
2022-07-16 254
2022-07-17 254
2022-07-18 254
2022-07-19 254
2022-07-20 254
2022-07-21 254
2022-07-22 254
2022-07-23 254
2022-07-24 254
2022-07-25 254
2022-06-26 254
2022-06-24 254
2022-05-24 254
2022-06-23 254
2022-05-25 254
2022-05-26 254
2022-05-27 254
2022-05-28 254
2022-05-29 254
2022-05-30 254
2022-05-31 254
2022-06-01 254
2022-06-02 254
2022-06-03 254
2022-06-04 254
2022-06-05 254
2022-06-06 254
2022-06-07 254
2022-06-08 254
2022-06-09 254
2022-06-10 254
2022-06-11 254
2022-06-12 254
2022-06-13 254
2022-06-14 254
2022-06-15 254
2022-06-16 254
2022-06-17 254
2022-06-18 254
2022-06-19 254
2022-06-20 254
2022-06-21 254
2022-06-22 254
2021-06-21 254
2021-08-16 254
2021-06-19 254
2021-03-28 254
2021-03-14 254
2021-03-15 254
2021-03-16 254
2021-03-17 254
2021-03-18 254
2021-03-19 254
2021-03-20 254
2021-03-21 254
2021-03-22 254
2021-03-23 254
2021-03-24 254
2021-03-25 254
2021-03-26 254
2021-03-27 254
2021-03-29 254
2021-03-12 254
2021-03-30 254
2021-03-31 254
2021-04-01 254
2021-04-02 254
2021-04-03 254
2021-04-04 254
2021-04-05 254
2021-04-06 254
2021-04-07 254
2021-04-08 254
2021-04-09 254
2021-04-10 254
2021-04-11 254
2021-04-12 254
2021-03-13 254
2021-03-11 254
2021-04-14 254
2021-02-22 254
2021-02-08 254
2021-02-09 254
2021-02-10 254
2021-02-11 254
2021-02-12 254
2021-02-13 254
2021-02-14 254
2021-02-15 254
2021-02-16 254
2021-06-18 254
2021-02-18 254
2021-02-19 254
2021-02-20 254
2021-02-21 254
2021-02-23 254
2021-03-10 254
2021-02-24 254
2021-02-25 254
2021-02-26 254
2021-02-27 254
2021-02-28 254
2021-03-01 254
2021-03-02 254
2021-03-03 254
2021-03-04 254
2021-03-05 254
2021-03-06 254
2021-03-07 254
2021-03-08 254
2021-03-09 254
2021-04-13 254
2021-02-17 254
2021-04-15 254
2021-06-02 254
2021-05-19 254
2021-05-20 254
2021-05-21 254
2021-05-22 254
2021-05-23 254
2021-05-24 254
2021-05-25 254
2021-05-26 254
2021-05-27 254
2021-05-28 254
2021-05-29 254
2021-05-30 254
2021-05-31 254
2021-06-01 254
2021-06-03 254
2021-05-17 254
2021-06-04 254
2021-06-05 254
2021-06-06 254
2021-06-07 254
2021-06-08 254
2021-06-09 254
2021-06-10 254
2021-06-11 254
2021-06-13 254
2021-06-14 254
2021-06-15 254
2021-06-16 254
2021-06-17 254
2021-04-16 254
2021-05-18 254
2021-06-12 254
2021-05-16 254
2021-04-30 254
2021-05-15 254
2021-04-17 254
2021-04-18 254
2021-04-20 254
2021-04-21 254
2021-04-22 254
2021-04-23 254
2021-04-24 254
2021-04-25 254
2021-04-26 254
2021-04-27 254
2021-04-28 254
2021-04-29 254
2021-04-19 254
2021-05-01 254
2021-05-08 254
2021-05-13 254
2021-05-12 254
2021-05-14 254
2021-05-11 254
2021-05-10 254
2021-05-09 254
2021-05-07 254
2021-05-06 254
2021-05-05 254
2021-05-04 254
2021-05-03 254
2021-05-02 254
2023-01-14 253
2023-01-10 253
2023-01-13 253
2023-01-12 253
2023-01-11 253
2023-01-05 253
2023-01-09 253
2023-01-08 253
2023-01-07 253
2023-01-06 253
2023-01-04 253
2023-01-03 253
2023-01-15 253
2023-01-21 253
2023-01-16 253
2023-01-17 253
2023-01-18 253
2023-01-19 253
2023-01-20 253
2023-01-22 253
2023-01-23 253
2023-01-24 253
2023-01-25 253
2023-01-26 253
2023-01-01 253
2023-01-27 253
2023-01-28 253
2023-01-02 253
2022-12-18 253
2022-12-31 253
2023-04-03 253
2022-12-05 253
2023-01-30 253
2022-12-07 253
2022-12-08 253
2022-12-09 253
2022-12-10 253
2022-12-11 253
2022-12-12 253
2022-12-13 253
2022-12-14 253
2022-12-15 253
2022-12-16 253
2022-12-17 253
2022-12-19 253
2022-12-20 253
2022-12-21 253
2022-12-22 253
2022-12-23 253
2022-12-24 253
2022-12-25 253
2022-12-26 253
2022-12-27 253
2022-12-28 253
2022-12-29 253
2022-12-30 253
2023-01-29 253
2023-03-03 253
2023-01-31 253
2023-03-16 253
2023-03-04 253
2023-03-05 253
2023-03-06 253
2023-03-07 253
2023-03-08 253
2023-03-09 253
2023-03-10 253
2023-03-11 253
2023-03-12 253
2023-03-13 253
2023-03-14 253
2023-03-15 253
2023-03-17 253
2023-02-01 253
2023-03-18 253
2023-03-19 253
2023-03-20 253
2023-03-21 253
2023-03-22 253
2023-03-23 253
2023-03-24 253
2023-03-25 253
2023-03-26 253
2023-03-27 253
2023-03-29 253
2023-03-28 253
2023-03-02 253
2023-03-01 253
2023-02-28 253
2023-02-27 253
2023-02-02 253
2023-02-03 253
2023-02-04 253
2023-02-05 253
2023-02-06 253
2023-02-07 253
2023-02-08 253
2023-02-09 253
2023-02-10 253
2023-02-11 253
2023-02-12 253
2023-02-13 253
2023-02-14 253
2023-02-15 253
2023-02-16 253
2023-02-17 253
2023-02-18 253
2023-02-19 253
2023-02-20 253
2023-02-21 253
2023-02-22 253
2023-02-23 253
2023-02-24 253
2023-02-25 253
2023-02-26 253
2023-04-02 253
2023-03-30 253
2021-01-21 253
2021-01-28 253
2021-01-14 253
2021-01-15 253
2021-01-16 253
2021-01-17 253
2021-01-18 253
2021-01-19 253
2021-01-20 253
2021-01-22 253
2021-01-23 253
2021-01-24 253
2021-01-26 253
2021-01-27 253
2021-01-25 253
2021-01-29 253
2021-02-04 253
2023-03-31 253
2023-04-01 253
2021-02-07 253
2021-02-06 253
2021-01-30 253
2021-02-05 253
2021-02-03 253
2021-02-02 253
2021-02-01 253
2021-01-31 253
2020-10-22 252
2020-10-18 252
2020-10-21 252
2020-10-20 252
2020-10-19 252
2023-04-05 252
2023-04-04 252
2020-10-15 252
2020-10-17 252
2020-10-14 252
2020-04-01 252
2020-04-02 252
2020-04-03 252
2020-04-04 252
2020-10-24 252
2020-10-23 252
2020-11-07 252
2020-10-25 252
2020-10-26 252
2020-10-27 252
2020-10-28 252
2020-10-29 252
2020-10-30 252
2020-10-31 252
2020-11-01 252
2020-11-02 252
2020-11-03 252
2020-11-04 252
2020-11-05 252
2020-11-06 252
2020-04-06 252
2020-11-08 252
2020-04-05 252
2020-04-20 252
2020-04-07 252
2020-04-08 252
2020-05-07 252
2020-05-06 252
2020-05-05 252
2020-05-04 252
2020-05-03 252
2020-05-02 252
2020-05-01 252
2020-04-30 252
2020-04-29 252
2020-04-28 252
2020-04-27 252
2020-04-26 252
2020-04-25 252
2020-04-24 252
2020-04-23 252
2020-04-22 252
2020-04-21 252
2020-11-10 252
2020-04-19 252
2020-04-18 252
2020-04-17 252
2020-04-16 252
2020-04-15 252
2020-04-14 252
2020-04-13 252
2020-04-12 252
2020-04-11 252
2020-04-10 252
2020-04-09 252
2020-11-09 252
2020-11-24 252
2020-11-11 252
2020-12-14 252
2020-12-16 252
2020-12-17 252
2020-12-18 252
2020-12-19 252
2020-12-20 252
2020-12-21 252
2020-12-22 252
2020-12-23 252
2020-12-24 252
2020-12-25 252
2020-12-26 252
2020-12-27 252
2020-12-28 252
2020-12-29 252
2020-12-30 252
2020-12-31 252
2021-01-01 252
2021-01-02 252
2021-01-03 252
2021-01-04 252
2021-01-05 252
2021-01-06 252
2021-01-07 252
2021-01-08 252
2021-01-09 252
2021-01-10 252
2021-01-11 252
2021-01-12 252
2021-01-13 252
2020-12-15 252
2020-12-13 252
2020-11-12 252
2020-12-12 252
2020-11-13 252
2020-11-14 252
2020-11-15 252
2020-11-16 252
2020-11-17 252
2020-11-18 252
2020-11-19 252
2020-11-20 252
2020-11-21 252
2020-11-22 252
2020-11-23 252
2020-05-09 252
2020-11-25 252
2020-11-26 252
2020-11-27 252
2020-11-28 252
2020-11-29 252
2020-11-30 252
2020-12-01 252
2020-12-02 252
2020-12-03 252
2020-12-04 252
2020-12-05 252
2020-12-06 252
2020-12-07 252
2020-12-08 252
2020-12-09 252
2020-12-10 252
2020-12-11 252
2020-05-08 252
2020-08-01 252
2020-05-10 252
2020-08-05 252
2020-08-12 252
2020-08-11 252
2020-08-10 252
2020-08-09 252
2020-08-08 252
2020-08-07 252
2020-08-06 252
2020-08-04 252
2020-08-14 252
2020-08-03 252
2020-08-02 252
2020-10-16 252
2020-07-31 252
2020-07-30 252
2020-06-18 252
2020-06-19 252
2020-08-13 252
2020-08-15 252
2020-07-23 252
2020-08-25 252
2020-09-01 252
2020-08-31 252
2020-08-30 252
2020-08-29 252
2020-08-28 252
2020-08-27 252
2020-08-26 252
2020-08-24 252
2020-08-16 252
2020-08-23 252
2020-08-22 252
2020-08-21 252
2020-08-20 252
2020-08-19 252
2020-08-18 252
2020-08-17 252
2020-06-20 252
2020-06-21 252
2020-07-29 252
2020-07-13 252
2020-07-06 252
2020-07-07 252
2020-07-08 252
2020-07-09 252
2020-07-10 252
2020-07-11 252
2020-07-12 252
2020-07-14 252
2020-07-28 252
2020-07-15 252
2020-07-16 252
2020-07-17 252
2020-07-18 252
2020-07-19 252
2020-07-20 252
2020-07-21 252
2020-07-05 252
2020-07-04 252
2020-07-03 252
2020-07-02 252
2020-05-11 252
2020-07-27 252
2020-07-26 252
2020-07-25 252
2020-06-22 252
2020-07-24 252
2020-06-23 252
2020-06-24 252
2020-06-25 252
2020-06-26 252
2020-06-27 252
2020-06-28 252
2020-06-29 252
2020-06-30 252
2020-07-01 252
2020-09-02 252
2020-09-03 252
2020-09-04 252
2020-06-07 252
2020-05-31 252
2020-06-01 252
2020-06-02 252
2020-06-03 252
2020-06-04 252
2020-06-05 252
2020-06-06 252
2020-06-08 252
2020-10-12 252
2020-06-09 252
2020-06-10 252
2020-06-11 252
2020-06-12 252
2020-06-13 252
2020-06-14 252
2020-06-15 252
2020-05-30 252
2020-05-29 252
2020-05-28 252
2020-05-27 252
2020-05-12 252
2020-05-13 252
2020-05-14 252
2020-10-13 252
2020-05-16 252
2020-05-17 252
2020-05-18 252
2020-05-19 252
2020-05-20 252
2020-05-21 252
2020-05-22 252
2020-05-23 252
2020-05-24 252
2020-05-25 252
2020-05-26 252
2020-06-16 252
2020-10-11 252
2020-09-05 252
2020-09-14 252
2020-09-21 252
2020-09-20 252
2020-09-19 252
2020-09-18 252
2020-09-17 252
2020-09-16 252
2020-09-15 252
2020-09-13 252
2020-06-17 252
2020-09-12 252
2020-09-11 252
2020-09-10 252
2020-09-09 252
2020-09-08 252
2020-09-07 252
2020-09-06 252
2020-09-22 252
2020-09-23 252
2020-09-24 252
2020-09-25 252
2020-10-10 252
2020-10-09 252
2020-10-08 252
2020-10-07 252
2020-10-06 252
2020-10-05 252
2020-10-04 252
2020-10-03 252
2020-10-02 252
2020-10-01 252
2020-09-30 252
2020-09-29 252
2020-09-28 252
2020-09-27 252
2020-09-26 252
2020-07-22 252
2020-03-29 251
2020-03-24 251
2020-05-15 251
2020-03-31 251
2020-03-30 251
2020-03-28 251
2020-03-27 251
2020-03-26 251
2020-03-25 251
2020-03-23 251
2020-03-22 251
2020-03-21 251
2020-03-20 251
2020-03-08 250
2020-03-10 250
2020-03-09 250
2020-03-13 250
2020-03-07 250
2020-03-12 250
2020-03-11 250
2020-03-16 250
2020-03-14 250
2020-03-15 250
2020-03-17 250
2020-03-18 250
2020-03-19 250
2020-03-06 249
2020-03-05 249
2020-03-04 249
2020-03-03 249
2020-03-02 249
2020-03-01 249
2020-02-23 248
2020-02-26 248
2020-02-24 248
2020-02-25 248
2020-02-20 248
2020-02-27 248
2020-02-28 248
2020-02-29 248
2020-02-21 248
2020-02-22 248
2023-04-08 248
2020-01-31 248
2020-02-08 248
2023-04-06 248
2023-04-07 248
2023-04-09 248
2020-02-19 248
2020-02-01 248
2020-02-02 248
2020-02-03 248
2020-02-04 248
2020-02-06 248
2020-02-07 248
2020-02-05 248
2020-02-09 248
2020-02-11 248
2020-02-12 248
2020-02-13 248
2020-02-14 248
2020-02-15 248
2020-02-10 248
2020-02-16 248
2020-02-17 248
2020-02-18 248
2020-01-17 247
2020-01-19 247
2020-01-18 247
2023-04-10 247
2020-01-16 247
2020-01-21 247
2020-01-20 247
2023-04-11 247
2020-01-22 247
2020-01-23 247
2020-01-24 247
2020-01-25 247
2020-01-26 247
2020-01-27 247
2020-01-28 247
2020-01-29 247
2020-01-30 247
2023-04-12 247
2020-01-03 246
2020-01-04 246
2020-01-15 246
2020-01-14 246
2020-01-13 246
2020-01-12 246
2020-01-11 246
2020-01-10 246
2020-01-09 246
2020-01-08 246
2020-01-07 246
2020-01-06 246
2020-01-05 246
2020-01-01 2
2020-01-02 2
Name: count, dtype: int64
--------------------------------------------------
Unique values in 'tests_units':
tests_units
tests performed 223167
people tested 52600
samples tested 25520
units unclear 1225
Name: count, dtype: int64
--------------------------------------------------
- Identifies categorical columns
- Displays all unique values along with their counts
- Includes NaN values for completeness
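The three points above can be sketched on a small stand-in frame. This is a minimal illustration, not the notebook's own cell: the `demo` data is made up, and treating every non-numeric column as categorical is an assumption, since the definition of `cat_cols` is not shown in this part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the COVID dataset (illustrative values only).
demo = pd.DataFrame({
    "location": ["Italy", "Italy", "Kenya"],
    "tests_units": ["tests performed", np.nan, "people tested"],
    "new_cases": [10.0, 12.0, 3.0],
})

# Assumption: any non-numeric column counts as categorical.
demo_cat_cols = [
    col for col in demo.columns
    if not pd.api.types.is_numeric_dtype(demo[col])
]

for col in demo_cat_cols:
    print(f"Unique values in '{col}':")
    # dropna=False keeps NaN as its own row in the counts
    print(demo[col].value_counts(dropna=False))
    print("-" * 50)
```

With `dropna=False`, the counts of each column add up to the number of rows, which makes missing values visible instead of silently dropping them.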
# Drop iso_codes treated as outliers because they have very few records:
# OWID_CYN (Northern Cyprus, 691 rows) and ESH (Western Sahara, 1 row)
invalid_iso_codes = ['OWID_CYN', 'ESH']
df = df[~df['iso_code'].isin(invalid_iso_codes)]

# List the remaining location values for inspection (not used by the filter below)
valid_locations = df['location'].value_counts().index.tolist()

# Flag aggregate locations by substring (e.g., "Lower middle income", "World").
# Continent names such as "Africa" do not contain any of these terms, so they are not removed.
def is_valid_location(loc):
    invalid_terms = ["income", "World", "continent", "region"]  # Broad terms that indicate aggregates
    return not any(term.lower() in loc.lower() for term in invalid_terms)

# Apply filtering
df = df[df['location'].apply(is_valid_location)]

# Print cleaned categorical data summary
for col in cat_cols:
    print(f"Column: {col}")
    print(df[col].value_counts(dropna=False))
    print("-" * 50)
Column: iso_code
iso_code
ARG 1198
MEX 1198
AFG 1196
PSE 1196
NER 1196
NGA 1196
NIU 1196
OWID_NAM 1196
PRK 1196
MKD 1196
MNP 1196
NOR 1196
OWID_OCE 1196
OMN 1196
PAK 1196
PLW 1196
PAN 1196
NZL 1196
PNG 1196
PRY 1196
PER 1196
PHL 1196
PCN 1196
POL 1196
PRT 1196
PRI 1196
QAT 1196
REU 1196
ROU 1196
RUS 1196
NIC 1196
NLD 1196
NCL 1196
MUS 1196
OWID_AFR 1196
LTU 1196
LUX 1196
MDG 1196
MWI 1196
MYS 1196
MDV 1196
MLI 1196
MLT 1196
MHL 1196
MTQ 1196
MRT 1196
MYT 1196
BLM 1196
FSM 1196
MDA 1196
MCO 1196
MNG 1196
MNE 1196
MSR 1196
MAR 1196
MOZ 1196
MMR 1196
NAM 1196
NRU 1196
NPL 1196
RWA 1196
SHN 1196
LBR 1196
UKR 1196
THA 1196
TLS 1196
TGO 1196
TKL 1196
TON 1196
TTO 1196
TUN 1196
TUR 1196
TKM 1196
TCA 1196
TUV 1196
UGA 1196
ARE 1196
TJK 1196
GBR 1196
USA 1196
VIR 1196
URY 1196
UZB 1196
VUT 1196
VAT 1196
VEN 1196
VNM 1196
WLF 1196
YEM 1196
ZMB 1196
TZA 1196
SYR 1196
KNA 1196
SGP 1196
LCA 1196
MAF 1196
SPM 1196
VCT 1196
WSM 1196
SMR 1196
STP 1196
SAU 1196
SEN 1196
SRB 1196
SYC 1196
SLE 1196
SXM 1196
CHE 1196
SVK 1196
SVN 1196
SLB 1196
SOM 1196
ZAF 1196
OWID_SAM 1196
KOR 1196
SSD 1196
ESP 1196
LKA 1196
SDN 1196
SWE 1196
LBY 1196
LIE 1196
LSO 1196
BRN 1196
BFA 1196
BDI 1196
KHM 1196
CMR 1196
CAN 1196
CPV 1196
CYM 1196
CAF 1196
TCD 1196
CHL 1196
CHN 1196
COL 1196
COM 1196
COG 1196
COK 1196
CRI 1196
CIV 1196
HRV 1196
CUB 1196
CUW 1196
CYP 1196
CZE 1196
COD 1196
DNK 1196
DJI 1196
DMA 1196
LBN 1196
BGR 1196
VGB 1196
EGY 1196
BRA 1196
ALB 1196
DZA 1196
ASM 1196
AND 1196
AGO 1196
AIA 1196
ATG 1196
ARM 1196
ABW 1196
OWID_ASI 1196
AUS 1196
AUT 1196
AZE 1196
BHS 1196
BHR 1196
BGD 1196
BRB 1196
BLR 1196
BEL 1196
BLZ 1196
BEN 1196
BMU 1196
BTN 1196
BOL 1196
BES 1196
BIH 1196
BWA 1196
ECU 1196
DOM 1196
SLV 1196
ISR 1196
GNB 1196
GUY 1196
HTI 1196
HND 1196
HUN 1196
ISL 1196
IND 1196
IDN 1196
IRN 1196
IRQ 1196
IRL 1196
IMN 1196
ITA 1196
GTM 1196
JAM 1196
JPN 1196
JEY 1196
JOR 1196
KAZ 1196
KEN 1196
KIR 1196
OWID_KOS 1196
KWT 1196
KGZ 1196
LAO 1196
LVA 1196
GGY 1196
GIN 1196
GUF 1196
PYF 1196
GNQ 1196
ERI 1196
EST 1196
SWZ 1196
ETH 1196
OWID_EUR 1196
OWID_EUN 1196
FRO 1196
FLK 1196
FJI 1196
FIN 1196
FRA 1196
GUM 1196
ZWE 1196
GAB 1196
GIB 1196
GLP 1196
GRD 1196
GRL 1196
GMB 1196
GRC 1196
GHA 1196
DEU 1196
GEO 1196
SUR 1195
TWN 1183
HKG 1165
OWID_NIR 1131
OWID_SCT 1123
OWID_ENG 1112
OWID_WLS 1100
MAC 787
Name: count, dtype: int64
--------------------------------------------------
Column: continent
continent
Africa 71760
Europe 66658
Asia 60543
North America 50234
Oceania 29900
South America 16745
Name: count, dtype: int64
--------------------------------------------------
Column: location
location
Argentina 1198
Mexico 1198
Afghanistan 1196
Palestine 1196
Niger 1196
Nigeria 1196
Niue 1196
North America 1196
North Korea 1196
North Macedonia 1196
Northern Mariana Islands 1196
Norway 1196
Oceania 1196
Oman 1196
Pakistan 1196
Palau 1196
Panama 1196
New Zealand 1196
Papua New Guinea 1196
Paraguay 1196
Peru 1196
Philippines 1196
Pitcairn 1196
Poland 1196
Portugal 1196
Puerto Rico 1196
Qatar 1196
Reunion 1196
Romania 1196
Russia 1196
Nicaragua 1196
Netherlands 1196
New Caledonia 1196
Mauritius 1196
Africa 1196
Lithuania 1196
Luxembourg 1196
Madagascar 1196
Malawi 1196
Malaysia 1196
Maldives 1196
Mali 1196
Malta 1196
Marshall Islands 1196
Martinique 1196
Mauritania 1196
Mayotte 1196
Saint Barthelemy 1196
Micronesia (country) 1196
Moldova 1196
Monaco 1196
Mongolia 1196
Montenegro 1196
Montserrat 1196
Morocco 1196
Mozambique 1196
Myanmar 1196
Namibia 1196
Nauru 1196
Nepal 1196
Rwanda 1196
Saint Helena 1196
Liberia 1196
Ukraine 1196
Thailand 1196
Timor 1196
Togo 1196
Tokelau 1196
Tonga 1196
Trinidad and Tobago 1196
Tunisia 1196
Turkey 1196
Turkmenistan 1196
Turks and Caicos Islands 1196
Tuvalu 1196
Uganda 1196
United Arab Emirates 1196
Tajikistan 1196
United Kingdom 1196
United States 1196
United States Virgin Islands 1196
Uruguay 1196
Uzbekistan 1196
Vanuatu 1196
Vatican 1196
Venezuela 1196
Vietnam 1196
Wallis and Futuna 1196
Yemen 1196
Zambia 1196
Tanzania 1196
Syria 1196
Saint Kitts and Nevis 1196
Singapore 1196
Saint Lucia 1196
Saint Martin (French part) 1196
Saint Pierre and Miquelon 1196
Saint Vincent and the Grenadines 1196
Samoa 1196
San Marino 1196
Sao Tome and Principe 1196
Saudi Arabia 1196
Senegal 1196
Serbia 1196
Seychelles 1196
Sierra Leone 1196
Sint Maarten (Dutch part) 1196
Switzerland 1196
Slovakia 1196
Slovenia 1196
Solomon Islands 1196
Somalia 1196
South Africa 1196
South America 1196
South Korea 1196
South Sudan 1196
Spain 1196
Sri Lanka 1196
Sudan 1196
Sweden 1196
Libya 1196
Liechtenstein 1196
Lesotho 1196
Brunei 1196
Burkina Faso 1196
Burundi 1196
Cambodia 1196
Cameroon 1196
Canada 1196
Cape Verde 1196
Cayman Islands 1196
Central African Republic 1196
Chad 1196
Chile 1196
China 1196
Colombia 1196
Comoros 1196
Congo 1196
Cook Islands 1196
Costa Rica 1196
Cote d'Ivoire 1196
Croatia 1196
Cuba 1196
Curacao 1196
Cyprus 1196
Czechia 1196
Democratic Republic of Congo 1196
Denmark 1196
Djibouti 1196
Dominica 1196
Lebanon 1196
Bulgaria 1196
British Virgin Islands 1196
Egypt 1196
Brazil 1196
Albania 1196
Algeria 1196
American Samoa 1196
Andorra 1196
Angola 1196
Anguilla 1196
Antigua and Barbuda 1196
Armenia 1196
Aruba 1196
Asia 1196
Australia 1196
Austria 1196
Azerbaijan 1196
Bahamas 1196
Bahrain 1196
Bangladesh 1196
Barbados 1196
Belarus 1196
Belgium 1196
Belize 1196
Benin 1196
Bermuda 1196
Bhutan 1196
Bolivia 1196
Bonaire Sint Eustatius and Saba 1196
Bosnia and Herzegovina 1196
Botswana 1196
Ecuador 1196
Dominican Republic 1196
El Salvador 1196
Israel 1196
Guinea-Bissau 1196
Guyana 1196
Haiti 1196
Honduras 1196
Hungary 1196
Iceland 1196
India 1196
Indonesia 1196
Iran 1196
Iraq 1196
Ireland 1196
Isle of Man 1196
Italy 1196
Guatemala 1196
Jamaica 1196
Japan 1196
Jersey 1196
Jordan 1196
Kazakhstan 1196
Kenya 1196
Kiribati 1196
Kosovo 1196
Kuwait 1196
Kyrgyzstan 1196
Laos 1196
Latvia 1196
Guernsey 1196
Guinea 1196
French Guiana 1196
French Polynesia 1196
Equatorial Guinea 1196
Eritrea 1196
Estonia 1196
Eswatini 1196
Ethiopia 1196
Europe 1196
European Union 1196
Faeroe Islands 1196
Falkland Islands 1196
Fiji 1196
Finland 1196
France 1196
Guam 1196
Zimbabwe 1196
Gabon 1196
Gibraltar 1196
Guadeloupe 1196
Grenada 1196
Greenland 1196
Gambia 1196
Greece 1196
Ghana 1196
Germany 1196
Georgia 1196
Suriname 1195
Taiwan 1183
Hong Kong 1165
Northern Ireland 1131
Scotland 1123
England 1112
Wales 1100
Macao 787
Name: count, dtype: int64
--------------------------------------------------
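The location counts above still list continent-level and bloc aggregates (Africa, Asia, Europe, European Union, North America, Oceania, South America), because none of the substring terms match them. One possible way to drop them is sketched below, assuming an explicit set of names. `AGGREGATE_LOCATIONS` and `drop_aggregates` are hypothetical helpers, not part of the notebook, and the `demo` frame is made up.

```python
import pandas as pd

# Hypothetical explicit set of aggregate rows, taken from the names still
# visible in the counts above; extend it if other aggregates appear.
AGGREGATE_LOCATIONS = {
    "Africa", "Asia", "Europe", "European Union",
    "North America", "Oceania", "South America",
}

def drop_aggregates(frame, names=AGGREGATE_LOCATIONS):
    """Return a copy of frame without rows whose location is an aggregate."""
    return frame[~frame["location"].isin(names)].copy()

# Toy data: two aggregates, one country, and one country whose name
# contains an aggregate name as a substring.
demo = pd.DataFrame({
    "location": ["Africa", "Kenya", "European Union", "South Africa"],
    "new_cases": [100, 10, 80, 20],
})

countries_only = drop_aggregates(demo)
print(countries_only["location"].tolist())
```

Exact matching is used here instead of substring matching because a substring test on "Africa" would also remove "South Africa" and "Central African Republic".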
Column: date
date
2021-08-24 248
2022-02-18 248
2022-03-06 248
2022-03-05 248
2022-03-04 248
2022-03-03 248
2022-03-02 248
2022-03-01 248
2022-02-28 248
2022-02-27 248
2022-02-26 248
2022-02-25 248
2022-02-24 248
2022-02-23 248
2022-02-22 248
2022-02-21 248
2022-02-20 248
2022-03-07 248
2022-03-08 248
2022-03-09 248
2022-03-18 248
2022-03-24 248
2022-03-23 248
2022-03-22 248
2022-03-21 248
2022-03-20 248
2022-03-19 248
2022-03-17 248
2022-03-10 248
2022-03-16 248
2022-03-15 248
2022-03-14 248
2022-03-13 248
2022-03-12 248
2022-03-11 248
2022-02-19 248
2022-02-17 248
2022-03-26 248
2022-02-16 248
2022-01-28 248
2022-01-27 248
2022-01-26 248
2022-01-25 248
2022-01-24 248
2022-01-23 248
2022-01-22 248
2022-01-21 248
2022-01-20 248
2022-01-19 248
2022-01-18 248
2022-01-17 248
2022-01-16 248
2022-01-15 248
2022-01-14 248
2022-01-29 248
2022-01-30 248
2022-01-31 248
2022-02-09 248
2022-02-15 248
2022-02-14 248
2022-02-13 248
2022-02-12 248
2022-02-11 248
2022-02-10 248
2022-02-08 248
2022-02-01 248
2022-02-07 248
2022-02-06 248
2022-02-05 248
2022-02-04 248
2022-02-03 248
2022-02-02 248
2022-03-25 248
2022-03-27 248
2022-01-12 248
2022-05-03 248
2022-05-19 248
2022-05-18 248
2022-05-17 248
2022-05-16 248
2022-05-15 248
2022-05-14 248
2022-05-13 248
2022-05-12 248
2022-05-11 248
2022-05-10 248
2022-05-09 248
2022-05-08 248
2022-05-07 248
2022-05-06 248
2022-05-05 248
2022-05-20 248
2022-05-21 248
2022-05-22 248
2022-05-31 248
2022-06-06 248
2022-06-05 248
2022-06-04 248
2022-06-03 248
2022-06-02 248
2022-06-01 248
2022-05-30 248
2022-05-23 248
2022-05-29 248
2022-05-28 248
2022-05-27 248
2022-05-26 248
2022-05-25 248
2022-05-24 248
2022-05-04 248
2022-05-02 248
2022-03-28 248
2022-05-01 248
2022-04-12 248
2022-04-11 248
2022-04-10 248
2022-04-09 248
2022-04-08 248
2022-04-07 248
2022-04-06 248
2022-04-05 248
2022-04-04 248
2022-04-03 248
2022-04-02 248
2022-04-01 248
2022-03-31 248
2022-03-30 248
2022-03-29 248
2022-04-13 248
2022-04-14 248
2022-04-15 248
2022-04-24 248
2022-04-30 248
2022-04-29 248
2022-04-28 248
2022-04-27 248
2022-04-26 248
2022-04-25 248
2022-04-23 248
2022-04-16 248
2022-04-22 248
2022-04-21 248
2022-04-20 248
2022-04-19 248
2022-04-18 248
2022-04-17 248
2022-01-13 248
2022-01-11 248
2022-06-08 248
2021-09-23 248
2021-10-09 248
2021-10-08 248
2021-10-07 248
2021-10-06 248
2021-10-05 248
2021-10-04 248
2021-10-03 248
2021-10-02 248
2021-10-01 248
2021-09-30 248
2021-09-29 248
2021-09-28 248
2021-09-27 248
2021-09-26 248
2021-09-25 248
2021-10-10 248
2021-10-11 248
2021-10-12 248
2021-10-21 248
2021-10-27 248
2021-10-26 248
2021-10-25 248
2021-10-24 248
2021-10-23 248
2021-10-22 248
2021-10-20 248
2021-10-13 248
2021-10-19 248
2021-10-18 248
2021-10-17 248
2021-10-16 248
2021-10-15 248
2021-10-14 248
2021-09-24 248
2021-09-22 248
2021-10-29 248
2021-09-21 248
2021-09-02 248
2021-09-01 248
2021-08-31 248
2021-08-30 248
2021-08-29 248
2021-08-28 248
2021-08-27 248
2021-08-26 248
2021-08-25 248
2023-04-01 248
2021-08-23 248
2021-08-22 248
2021-08-21 248
2021-08-20 248
2021-08-19 248
2021-09-03 248
2021-09-04 248
2021-09-05 248
2021-09-14 248
2021-09-20 248
2021-09-19 248
2021-09-18 248
2021-09-17 248
2021-09-16 248
2021-09-15 248
2021-09-13 248
2021-09-06 248
2021-09-12 248
2021-09-11 248
2021-09-10 248
2021-09-09 248
2021-09-08 248
2021-09-07 248
2021-10-28 248
2021-10-30 248
2022-01-10 248
2021-12-06 248
2021-12-22 248
2021-12-21 248
2021-12-20 248
2021-12-19 248
2021-12-18 248
2021-12-17 248
2021-12-16 248
2021-12-15 248
2021-12-14 248
2021-12-13 248
2021-12-12 248
2021-12-11 248
2021-12-10 248
2021-12-09 248
2021-12-08 248
2021-12-23 248
2021-12-24 248
2021-12-25 248
2022-01-03 248
2022-01-09 248
2022-01-08 248
2022-01-07 248
2022-01-06 248
2022-01-05 248
2022-01-04 248
2022-01-02 248
2021-12-26 248
2022-01-01 248
2021-12-31 248
2021-12-30 248
2021-12-29 248
2021-12-28 248
2021-12-27 248
2021-12-07 248
2021-12-05 248
2021-10-31 248
2021-12-04 248
2021-11-15 248
2021-11-14 248
2021-11-13 248
2021-11-12 248
2021-11-11 248
2021-11-10 248
2021-11-09 248
2021-11-08 248
2021-11-07 248
2021-11-06 248
2021-11-05 248
2021-11-04 248
2021-11-03 248
2021-11-02 248
2021-11-01 248
2021-11-16 248
2021-11-17 248
2021-11-18 248
2021-11-27 248
2021-12-03 248
2021-12-02 248
2021-12-01 248
2021-11-30 248
2021-11-29 248
2021-11-28 248
2021-11-26 248
2021-11-19 248
2021-11-25 248
2021-11-24 248
2021-11-23 248
2021-11-22 248
2021-11-21 248
2021-11-20 248
2022-06-07 248
2022-06-09 248
2021-08-17 248
2022-12-11 248
2022-12-27 248
2022-12-26 248
2022-12-25 248
2022-12-24 248
2022-12-23 248
2022-12-22 248
2022-12-21 248
2022-12-20 248
2022-12-19 248
2022-12-18 248
2022-12-17 248
2022-12-16 248
2022-12-15 248
2022-12-14 248
2022-12-13 248
2022-12-28 248
2022-12-29 248
2022-12-30 248
2023-01-08 248
2023-01-14 248
2023-01-13 248
2023-01-12 248
2023-01-11 248
2023-01-10 248
2023-01-09 248
2023-01-07 248
2022-12-31 248
2023-01-06 248
2023-01-05 248
2023-01-04 248
2023-01-03 248
2023-01-02 248
2023-01-01 248
2022-12-12 248
2022-12-10 248
2023-01-16 248
2022-12-09 248
2022-11-20 248
2022-11-19 248
2022-11-18 248
2022-11-17 248
2022-11-16 248
2022-11-15 248
2022-11-14 248
2022-11-13 248
2022-11-12 248
2022-11-11 248
2022-11-10 248
2022-11-09 248
2022-11-08 248
2022-11-07 248
2022-11-06 248
2022-11-21 248
2022-11-22 248
2022-11-23 248
2022-12-02 248
2022-12-08 248
2022-12-07 248
2022-12-06 248
2022-12-05 248
2022-12-04 248
2022-12-03 248
2022-12-01 248
2022-11-24 248
2022-11-30 248
2022-11-29 248
2022-11-28 248
2022-11-27 248
2022-11-26 248
2022-11-25 248
2023-01-15 248
2023-01-17 248
2022-11-04 248
2023-02-23 248
2023-03-11 248
2023-03-10 248
2023-03-09 248
2023-03-08 248
2023-03-07 248
2023-03-06 248
2023-03-05 248
2023-03-04 248
2023-03-03 248
2023-03-02 248
2023-03-01 248
2023-02-28 248
2023-02-27 248
2023-02-26 248
2023-02-25 248
2023-03-12 248
2023-03-13 248
2023-03-14 248
2023-03-23 248
2023-03-29 248
2023-03-28 248
2023-03-27 248
2023-03-26 248
2023-03-25 248
2023-03-24 248
2023-03-22 248
2023-03-15 248
2023-03-21 248
2023-03-20 248
2023-03-19 248
2023-03-18 248
2023-03-17 248
2023-03-16 248
2023-02-24 248
2023-02-22 248
2023-01-18 248
2023-02-21 248
2023-02-02 248
2023-02-01 248
2023-01-31 248
2023-01-30 248
2023-01-29 248
2023-01-28 248
2023-01-27 248
2023-01-26 248
2023-01-25 248
2023-01-24 248
2023-01-23 248
2023-01-22 248
2023-01-21 248
2023-01-20 248
2023-01-19 248
2023-02-03 248
2023-02-04 248
2023-02-05 248
2023-02-14 248
2023-02-20 248
2023-02-19 248
2023-02-18 248
2023-02-17 248
2023-02-16 248
2023-02-15 248
2023-02-13 248
2023-02-06 248
2023-02-12 248
2023-02-11 248
2023-02-10 248
2023-02-09 248
2023-02-08 248
2023-02-07 248
2022-11-05 248
2022-11-03 248
2022-06-10 248
2022-07-16 248
2022-08-01 248
2022-07-31 248
2022-07-30 248
2022-07-29 248
2022-07-28 248
2022-07-27 248
2022-07-26 248
2022-07-25 248
2022-07-24 248
2022-07-23 248
2022-07-22 248
2022-07-21 248
2022-07-20 248
2022-07-19 248
2022-07-18 248
2022-08-02 248
2022-08-03 248
2022-08-04 248
2022-08-13 248
2022-08-19 248
2022-08-18 248
2022-08-17 248
2022-08-16 248
2022-08-15 248
2022-08-14 248
2022-08-12 248
2022-08-05 248
2022-08-11 248
2022-08-10 248
2022-08-09 248
2022-08-08 248
2022-08-07 248
2022-08-06 248
2022-07-17 248
2022-07-15 248
2022-08-21 248
2022-07-14 248
2022-06-25 248
2022-06-24 248
2022-06-23 248
2022-06-22 248
2022-06-21 248
2022-06-20 248
2022-06-19 248
2022-06-18 248
2022-06-17 248
2022-06-16 248
2022-06-15 248
2022-06-14 248
2022-06-13 248
2022-06-12 248
2022-06-11 248
2022-06-26 248
2022-06-27 248
2022-06-28 248
2022-07-07 248
2022-07-13 248
2022-07-12 248
2022-07-11 248
2022-07-10 248
2022-07-09 248
2022-07-08 248
2022-07-06 248
2022-06-29 248
2022-07-05 248
2022-07-04 248
2022-07-03 248
2022-07-02 248
2022-07-01 248
2022-06-30 248
2022-08-20 248
2022-08-22 248
2022-11-02 248
2022-09-28 248
2022-10-14 248
2022-10-13 248
2022-10-12 248
2022-10-11 248
2022-10-10 248
2022-10-09 248
2022-10-08 248
2022-10-07 248
2022-10-06 248
2022-10-05 248
2022-10-04 248
2022-10-03 248
2022-10-02 248
2022-10-01 248
2022-09-30 248
2022-10-15 248
2022-10-16 248
2022-10-17 248
2022-10-26 248
2022-11-01 248
2022-10-31 248
2022-10-30 248
2022-10-29 248
2022-10-28 248
2022-10-27 248
2022-10-25 248
2022-10-18 248
2022-10-24 248
2022-10-23 248
2022-10-22 248
2022-10-21 248
2022-10-20 248
2022-10-19 248
2022-09-29 248
2022-09-27 248
2022-08-23 248
2022-09-26 248
2022-09-07 248
2022-09-06 248
2022-09-05 248
2022-09-04 248
2022-09-03 248
2022-09-02 248
2022-09-01 248
2022-08-31 248
2022-08-30 248
2022-08-29 248
2022-08-28 248
2022-08-27 248
2022-08-26 248
2022-08-25 248
2022-08-24 248
2022-09-08 248
2022-09-09 248
2022-09-10 248
2022-09-19 248
2022-09-25 248
2022-09-24 248
2022-09-23 248
2022-09-22 248
2022-09-21 248
2022-09-20 248
2022-09-18 248
2022-09-11 248
2022-09-17 248
2022-09-16 248
2022-09-15 248
2022-09-14 248
2022-09-13 248
2022-09-12 248
2021-08-18 248
2021-08-16 248
2023-03-31 248
2021-03-24 248
2021-04-15 248
2021-04-14 248
2021-04-13 248
2021-04-12 248
2021-04-11 248
2021-04-10 248
2021-04-09 248
2021-04-08 248
2021-04-07 248
2021-04-06 248
2021-04-05 248
2021-04-04 248
2021-04-03 248
2021-04-02 248
2021-04-01 248
2021-03-31 248
2021-03-30 248
2021-03-29 248
2021-03-28 248
2021-03-27 248
2021-03-26 248
2021-04-16 248
2021-04-17 248
2021-04-18 248
2021-04-30 248
2021-05-09 248
2021-05-08 248
2021-05-07 248
2021-05-06 248
2021-05-05 248
2021-05-04 248
2021-05-03 248
2021-05-02 248
2021-05-01 248
2021-04-29 248
2021-04-19 248
2021-04-28 248
2021-04-27 248
2021-04-26 248
2021-04-25 248
2021-04-24 248
2021-04-23 248
2021-04-22 248
2021-04-21 248
2021-04-20 248
2021-03-25 248
2021-03-23 248
2021-05-11 248
2021-03-22 248
2021-02-25 248
2021-02-24 248
2021-02-23 248
2021-02-22 248
2021-02-21 248
2021-02-20 248
2021-02-19 248
2021-02-18 248
2021-02-17 248
2021-02-16 248
2021-02-15 248
2021-02-14 248
2021-02-13 248
2021-02-12 248
2021-02-11 248
2021-02-10 248
2021-02-09 248
2021-02-08 248
2021-08-15 248
2023-04-02 248
2023-04-03 248
2021-02-26 248
2021-02-27 248
2021-02-28 248
2021-03-12 248
2021-03-21 248
2021-03-20 248
2021-03-19 248
2021-03-18 248
2021-03-17 248
2021-03-16 248
2021-03-15 248
2021-03-14 248
2021-03-13 248
2021-03-11 248
2021-03-01 248
2021-03-10 248
2021-03-09 248
2021-03-08 248
2021-03-07 248
2021-03-06 248
2021-03-05 248
2021-03-04 248
2021-03-03 248
2021-03-02 248
2021-05-10 248
2023-03-30 248
2021-05-12 248
2021-06-29 248
2021-07-21 248
2021-07-20 248
2021-07-19 248
2021-07-18 248
2021-07-17 248
2021-07-16 248
2021-07-15 248
2021-07-14 248
2021-07-13 248
2021-07-12 248
2021-07-11 248
2021-07-10 248
2021-07-09 248
2021-07-08 248
2021-07-07 248
2021-07-06 248
2021-07-05 248
2021-07-04 248
2021-07-03 248
2021-07-02 248
2021-07-01 248
2021-07-22 248
2021-07-23 248
2021-07-24 248
2021-08-06 248
2021-08-12 248
2021-05-13 248
2021-08-13 248
2021-08-14 248
2021-08-11 248
2021-08-10 248
2021-08-09 248
2021-08-08 248
2021-08-07 248
2021-08-04 248
2021-07-25 248
2021-08-03 248
2021-08-02 248
2021-08-01 248
2021-07-31 248
2021-07-30 248
2021-07-29 248
2021-07-28 248
2021-07-27 248
2021-07-26 248
2021-06-30 248
2021-08-05 248
2021-06-28 248
2021-05-24 248
2021-06-02 248
2021-06-01 248
2021-05-31 248
2021-05-30 248
2021-05-29 248
2021-05-28 248
2021-05-27 248
2021-05-26 248
2021-05-25 248
2021-05-23 248
2021-06-04 248
2021-05-22 248
2021-05-21 248
2021-05-20 248
2021-05-18 248
2021-05-17 248
2021-05-16 248
2021-05-15 248
2021-06-27 248
2021-05-14 248
2021-06-03 248
2021-05-19 248
2021-06-05 248
2021-06-25 248
2021-06-26 248
2021-06-24 248
2021-06-23 248
2021-06-22 248
2021-06-21 248
2021-06-20 248
2021-06-19 248
2021-06-06 248
2021-06-17 248
2021-06-18 248
2021-06-16 248
2021-06-15 248
2021-06-14 248
2021-06-13 248
2021-06-12 248
2021-06-11 248
2021-06-10 248
2021-06-09 248
2021-06-08 248
2021-06-07 248
2020-07-08 247
2020-07-09 247
2020-07-10 247
2020-07-11 247
2020-07-12 247
2020-07-13 247
2020-07-20 247
2020-07-14 247
2020-07-15 247
2020-07-16 247
2020-07-17 247
2020-07-18 247
2020-07-19 247
2020-07-06 247
2020-07-21 247
2020-07-22 247
2020-07-07 247
2020-06-26 247
2020-07-05 247
2020-06-24 247
2020-06-18 247
2020-07-24 247
2020-06-19 247
2020-06-20 247
2020-06-21 247
2020-06-22 247
2020-06-23 247
2020-06-25 247
2020-07-04 247
2020-06-27 247
2020-06-28 247
2020-06-29 247
2020-06-30 247
2020-07-01 247
2020-07-02 247
2020-07-03 247
2020-07-23 247
2020-08-15 247
2020-07-25 247
2020-08-23 247
2020-08-16 247
2020-08-17 247
2020-08-18 247
2020-08-19 247
2020-08-20 247
2020-08-21 247
2020-08-22 247
2020-08-24 247
2020-07-26 247
2020-08-25 247
2020-08-26 247
2020-08-27 247
2020-08-28 247
2020-08-29 247
2020-08-30 247
2020-06-16 247
2020-08-14 247
2020-08-13 247
2020-08-12 247
2020-08-11 247
2020-07-27 247
2020-07-28 247
2020-07-29 247
2020-07-30 247
2020-07-31 247
2020-08-01 247
2020-08-02 247
2020-08-03 247
2020-08-04 247
2020-08-05 247
2020-08-06 247
2020-08-07 247
2020-08-08 247
2020-08-09 247
2020-08-10 247
2020-06-17 247
2020-04-30 247
2020-06-15 247
2020-04-26 247
2020-04-19 247
2020-04-20 247
2020-04-21 247
2020-04-22 247
2020-04-23 247
2020-04-24 247
2020-04-25 247
2020-04-27 247
2020-04-17 247
2020-04-28 247
2020-04-29 247
2020-09-01 247
2020-05-01 247
2020-05-02 247
2020-05-03 247
2020-05-04 247
2020-04-18 247
2020-04-16 247
2020-06-14 247
2020-04-06 247
2023-04-05 247
2023-04-04 247
2020-04-01 247
2020-04-02 247
2020-04-03 247
2020-04-04 247
2020-04-05 247
2020-04-07 247
2020-04-15 247
2020-04-08 247
2020-04-09 247
2020-04-10 247
2020-04-11 247
2020-04-12 247
2020-04-13 247
2020-04-14 247
2020-05-05 247
2020-05-06 247
2020-05-07 247
2020-06-05 247
2020-05-29 247
2020-05-30 247
2020-05-31 247
2020-06-01 247
2020-06-02 247
2020-06-03 247
2020-06-04 247
2020-06-06 247
2020-05-08 247
2020-06-07 247
2020-06-08 247
2020-06-09 247
2020-06-10 247
2020-06-11 247
2020-06-12 247
2020-06-13 247
2020-05-28 247
2020-05-27 247
2020-05-26 247
2020-05-25 247
2020-05-09 247
2020-05-10 247
2020-05-11 247
2020-05-12 247
2020-05-13 247
2020-05-14 247
2020-05-16 247
2020-05-17 247
2020-05-18 247
2020-05-19 247
2020-05-20 247
2020-05-21 247
2020-05-22 247
2020-05-23 247
2020-05-24 247
2020-08-31 247
2020-09-06 247
2020-09-02 247
2020-12-21 247
2020-12-14 247
2020-12-15 247
2020-12-16 247
2020-12-17 247
2020-12-18 247
2020-12-19 247
2020-12-20 247
2020-12-22 247
2020-12-12 247
2020-12-23 247
2020-12-24 247
2020-12-25 247
2020-12-26 247
2020-12-27 247
2020-12-28 247
2020-12-29 247
2020-12-13 247
2020-12-11 247
2020-12-31 247
2020-12-01 247
2020-11-24 247
2020-11-25 247
2020-11-26 247
2020-11-27 247
2020-11-28 247
2020-11-29 247
2020-11-30 247
2020-12-02 247
2020-12-10 247
2020-12-03 247
2020-12-04 247
2020-12-05 247
2020-12-06 247
2020-12-07 247
2020-12-08 247
2020-12-09 247
2020-12-30 247
2021-01-01 247
2020-11-22 247
2021-01-30 247
2021-01-23 247
2021-01-24 247
2021-01-25 247
2021-01-26 247
2021-01-27 247
2021-01-28 247
2021-01-29 247
2021-01-31 247
2021-01-21 247
2021-02-01 247
2021-02-02 247
2021-02-03 247
2021-02-04 247
2021-02-05 247
2021-02-06 247
2021-02-07 247
2021-01-22 247
2021-01-20 247
2020-09-03 247
2021-01-10 247
2021-01-03 247
2021-01-04 247
2021-01-05 247
2021-01-06 247
2021-01-07 247
2021-01-08 247
2021-01-09 247
2021-01-11 247
2021-01-19 247
2021-01-12 247
2021-01-13 247
2021-01-14 247
2021-01-15 247
2021-01-16 247
2021-01-17 247
2021-01-18 247
2020-11-23 247
2021-01-02 247
2020-11-21 247
2020-10-12 247
2020-09-24 247
2020-09-25 247
2020-09-26 247
2020-09-27 247
2020-09-28 247
2020-09-29 247
2020-09-30 247
2020-10-01 247
2020-10-02 247
2020-10-03 247
2020-10-04 247
2020-10-05 247
2020-10-06 247
2020-10-07 247
2020-10-08 247
2020-10-09 247
2020-10-10 247
2020-09-23 247
2020-09-22 247
2020-09-21 247
2020-09-11 247
2020-09-04 247
2020-09-05 247
2020-09-07 247
2020-09-08 247
2020-11-20 247
2020-09-09 247
2020-09-10 247
2020-09-12 247
2020-09-20 247
2020-09-13 247
2020-09-14 247
2020-09-15 247
2020-09-16 247
2020-09-17 247
2020-09-18 247
2020-09-19 247
2020-10-11 247
2020-10-25 247
2020-10-13 247
2020-11-01 247
2020-11-04 247
2020-11-05 247
2020-11-06 247
2020-11-07 247
2020-11-08 247
2020-11-09 247
2020-11-10 247
2020-11-11 247
2020-11-12 247
2020-11-13 247
2020-11-14 247
2020-11-18 247
2020-11-17 247
2020-11-16 247
2020-10-14 247
2020-11-03 247
2020-11-02 247
2020-10-31 247
2020-11-19 247
2020-10-15 247
2020-10-16 247
2020-10-17 247
2020-10-18 247
2020-10-19 247
2020-10-20 247
2020-10-22 247
2020-10-21 247
2020-10-23 247
2020-10-24 247
2020-10-26 247
2020-10-27 247
2020-10-28 247
2020-10-29 247
2020-10-30 247
2020-11-15 247
2020-03-23 246
2020-03-20 246
2020-03-21 246
2020-03-22 246
2020-03-24 246
2020-03-25 246
2020-03-26 246
2020-03-30 246
2020-03-28 246
2020-03-31 246
2020-05-15 246
2020-03-29 246
2020-03-27 246
2020-03-15 245
2020-03-14 245
2020-03-13 245
2020-03-16 245
2020-03-11 245
2020-03-10 245
2020-03-09 245
2020-03-08 245
2020-03-07 245
2020-03-17 245
2020-03-18 245
2020-03-19 245
2020-03-12 245
2020-03-04 244
2020-03-01 244
2020-03-06 244
2020-03-05 244
2020-03-03 244
2020-03-02 244
2020-02-29 243
2020-02-02 243
2020-02-07 243
2020-02-06 243
2020-02-05 243
2020-02-04 243
2020-02-03 243
2020-01-31 243
2020-02-28 243
2020-02-09 243
2023-04-06 243
2023-04-07 243
2023-04-08 243
2023-04-09 243
2020-02-08 243
2020-02-01 243
2020-02-10 243
2020-02-11 243
2020-02-27 243
2020-02-26 243
2020-02-25 243
2020-02-24 243
2020-02-23 243
2020-02-22 243
2020-02-21 243
2020-02-20 243
2020-02-12 243
2020-02-13 243
2020-02-14 243
2020-02-15 243
2020-02-16 243
2020-02-17 243
2020-02-18 243
2020-02-19 243
2020-01-28 242
2020-01-30 242
2023-04-12 242
2023-04-11 242
2023-04-10 242
2020-01-16 242
2020-01-27 242
2020-01-17 242
2020-01-18 242
2020-01-19 242
2020-01-20 242
2020-01-29 242
2020-01-22 242
2020-01-23 242
2020-01-24 242
2020-01-25 242
2020-01-26 242
2020-01-21 242
2020-01-03 241
2020-01-04 241
2020-01-15 241
2020-01-14 241
2020-01-13 241
2020-01-12 241
2020-01-11 241
2020-01-10 241
2020-01-09 241
2020-01-08 241
2020-01-07 241
2020-01-06 241
2020-01-05 241
2020-01-01 2
2020-01-02 2
Name: count, dtype: int64
--------------------------------------------------
Column: tests_units
tests_units
tests performed 216495
people tested 52600
samples tested 25520
units unclear 1225
Name: count, dtype: int64
--------------------------------------------------
To resolve inconsistencies and outliers in the categorical data, we applied the following steps:
iso_code:
OWID_CYN and ESH are not valid iso_code values, so we can drop them.
location:
location should contain only country/region names, but values like "Lower middle income" and "Africa" indicate misclassified data.
The cleaning step:
Removes outlier iso_code values (OWID_CYN and ESH).
Filters out incorrect location values (like "Lower middle income", "Africa", etc.).
Ensures only valid country/region names remain in location.
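The filtering cell itself does not appear in this export. A minimal sketch of the steps listed above, run on a small hypothetical frame, might look like the following. The `sample` data, the `non_country_locations` list, and the variable names are assumptions for illustration, not the notebook's actual code.

```python
import pandas as pd

# Hypothetical miniature of the dataset; the real frame has many more columns.
sample = pd.DataFrame({
    "iso_code": ["ARG", "OWID_CYN", "ESH", "OWID_AFR", "OWID_LMC", "FRA"],
    "location": ["Argentina", "Northern Cyprus", "Western Sahara",
                 "Africa", "Lower middle income", "France"],
})

# iso_code values the text marks as invalid.
invalid_iso_codes = ["OWID_CYN", "ESH"]
# Assumed (incomplete) list of aggregate labels that are not countries.
non_country_locations = ["Lower middle income", "Africa"]

# Keep a row only if it passes both checks.
keep = (~sample["iso_code"].isin(invalid_iso_codes)
        & ~sample["location"].isin(non_country_locations))
cleaned = sample[keep].reset_index(drop=True)

print(cleaned["location"].tolist())
```

On the full dataset the same mask would be built from `df` instead of `sample`, with the aggregate list extended to whatever non-country labels the `value_counts()` output reveals.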
df['date'] = pd.to_datetime(df['date'], errors='coerce')
pd.to_datetime(df['date']) converts the column to a proper datetime format.
errors='coerce' ensures that any invalid values (e.g., wrong formats) become NaT (Not a Time), preventing errors.
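To illustrate what errors='coerce' does, here is a small sketch on a hypothetical series (the `raw_dates` values are made up for the example). Counting the resulting NaT values is one way to check how many entries failed to parse.

```python
import pandas as pd

# Hypothetical input: two valid ISO dates and one malformed entry.
raw_dates = pd.Series(["2020-01-01", "2021-08-24", "not a date"])

# Invalid entries become NaT instead of raising an exception.
parsed = pd.to_datetime(raw_dates, errors="coerce")

# Number of values that could not be parsed.
n_invalid = int(parsed.isna().sum())
print(parsed.dtype, n_invalid)
```

In this dataset `df.info()` reports no nulls in `date` after conversion, which suggests every value parsed cleanly.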
# Rename columns
df = df.rename(columns={
'iso_code': 'country_code',
'location': 'country',
'total_cases': 'total_confirmed_cases',
'new_cases': 'new_confirmed_cases',
'total_deaths': 'total_deaths_reported',
'new_deaths': 'new_deaths_reported'
})
Rename columns for better readability and consistency.
df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 295840 entries, 0 to 302511
Data columns (total 67 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 country_code 295840 non-null object
1 continent 295840 non-null object
2 country 295840 non-null object
3 date 295840 non-null datetime64[ns]
4 total_confirmed_cases 295840 non-null float64
5 new_confirmed_cases 295840 non-null float64
6 new_cases_smoothed 295840 non-null float64
7 total_deaths_reported 295840 non-null float64
8 new_deaths_reported 295840 non-null float64
9 new_deaths_smoothed 295840 non-null float64
10 total_cases_per_million 295840 non-null float64
11 new_cases_per_million 295840 non-null float64
12 new_cases_smoothed_per_million 295840 non-null float64
13 total_deaths_per_million 295840 non-null float64
14 new_deaths_per_million 295840 non-null float64
15 new_deaths_smoothed_per_million 295840 non-null float64
16 reproduction_rate 295840 non-null float64
17 icu_patients 295840 non-null float64
18 icu_patients_per_million 295840 non-null float64
19 hosp_patients 295840 non-null float64
20 hosp_patients_per_million 295840 non-null float64
21 weekly_icu_admissions 295840 non-null float64
22 weekly_icu_admissions_per_million 295840 non-null float64
23 weekly_hosp_admissions 295840 non-null float64
24 weekly_hosp_admissions_per_million 295840 non-null float64
25 total_tests 295840 non-null float64
26 new_tests 295840 non-null float64
27 total_tests_per_thousand 295840 non-null float64
28 new_tests_per_thousand 295840 non-null float64
29 new_tests_smoothed 295840 non-null float64
30 new_tests_smoothed_per_thousand 295840 non-null float64
31 positive_rate 295840 non-null float64
32 tests_per_case 295840 non-null float64
33 tests_units 295840 non-null object
34 total_vaccinations 295840 non-null float64
35 people_vaccinated 295840 non-null float64
36 people_fully_vaccinated 295840 non-null float64
37 total_boosters 295840 non-null float64
38 new_vaccinations 295840 non-null float64
39 new_vaccinations_smoothed 295840 non-null float64
40 total_vaccinations_per_hundred 295840 non-null float64
41 people_vaccinated_per_hundred 295840 non-null float64
42 people_fully_vaccinated_per_hundred 295840 non-null float64
43 total_boosters_per_hundred 295840 non-null float64
44 new_vaccinations_smoothed_per_million 295840 non-null float64
45 new_people_vaccinated_smoothed 295840 non-null float64
46 new_people_vaccinated_smoothed_per_hundred 295840 non-null float64
47 stringency_index 295840 non-null float64
48 population_density 295840 non-null float64
49 median_age 295840 non-null float64
50 aged_65_older 295840 non-null float64
51 aged_70_older 295840 non-null float64
52 gdp_per_capita 295840 non-null float64
53 extreme_poverty 295840 non-null float64
54 cardiovasc_death_rate 295840 non-null float64
55 diabetes_prevalence 295840 non-null float64
56 female_smokers 295840 non-null float64
57 male_smokers 295840 non-null float64
58 handwashing_facilities 295840 non-null float64
59 hospital_beds_per_thousand 295840 non-null float64
60 life_expectancy 295840 non-null float64
61 human_development_index 295840 non-null float64
62 population 295840 non-null float64
63 excess_mortality_cumulative_absolute 295840 non-null float64
64 excess_mortality_cumulative 295840 non-null float64
65 excess_mortality 295840 non-null float64
66 excess_mortality_cumulative_per_million 295840 non-null float64
dtypes: datetime64[ns](1), float64(62), object(4)
memory usage: 153.5+ MB
# Select numerical columns
numerical_cols = df.select_dtypes(include=[np.number]).columns
n_cols = len(numerical_cols)
n_rows = math.ceil(n_cols / 4) # Adjust the number of rows based on the number of columns
plt.figure(figsize=(20, 5 * n_rows)) # Adjust figure size based on the number of rows
for i, col in enumerate(numerical_cols, 1):
    plt.subplot(n_rows, 4, i)
    sns.boxplot(y=df[col])
    plt.title(col)
plt.tight_layout()
plt.show()
from scipy.stats import zscore
# Calculate Z-scores for numerical columns
z_scores = df[numerical_cols].apply(zscore)
# Define a threshold (e.g., 3 or -3)
threshold = 3
outliers_z = (z_scores > threshold) | (z_scores < -threshold)
# Count outliers in each column
print("Outliers detected using Z-Score:")
print(outliers_z.sum())
Outliers detected using Z-Score:
total_confirmed_cases 4269
new_confirmed_cases 1569
new_cases_smoothed 1564
total_deaths_reported 6123
new_deaths_reported 4001
new_deaths_smoothed 4106
total_cases_per_million 6690
new_cases_per_million 1866
new_cases_smoothed_per_million 3621
total_deaths_per_million 6286
new_deaths_per_million 3053
new_deaths_smoothed_per_million 5349
reproduction_rate 1113
icu_patients 2119
icu_patients_per_million 6180
hosp_patients 2046
hosp_patients_per_million 4079
weekly_icu_admissions 2401
weekly_icu_admissions_per_million 2963
weekly_hosp_admissions 9758
weekly_hosp_admissions_per_million 3832
total_tests 518
new_tests 2759
total_tests_per_thousand 5848
new_tests_per_thousand 5898
new_tests_smoothed 1149
new_tests_smoothed_per_thousand 3184
positive_rate 7901
tests_per_case 589
total_vaccinations 3632
people_vaccinated 3362
people_fully_vaccinated 3411
total_boosters 4640
new_vaccinations 1790
new_vaccinations_smoothed 1756
total_vaccinations_per_hundred 1058
people_vaccinated_per_hundred 0
people_fully_vaccinated_per_hundred 0
total_boosters_per_hundred 3405
new_vaccinations_smoothed_per_million 6337
new_people_vaccinated_smoothed 1640
new_people_vaccinated_smoothed_per_hundred 5740
stringency_index 0
population_density 4344
median_age 0
aged_65_older 2392
aged_70_older 2392
gdp_per_capita 5571
extreme_poverty 2392
cardiovasc_death_rate 1196
diabetes_prevalence 4784
female_smokers 0
male_smokers 2392
handwashing_facilities 0
hospital_beds_per_thousand 8372
life_expectancy 0
human_development_index 0
population 4784
excess_mortality_cumulative_absolute 8036
excess_mortality_cumulative 3854
excess_mortality 5019
excess_mortality_cumulative_per_million 5224
dtype: int64
# Apply winsorization to cap extreme values
df[numerical_cols] = df[numerical_cols].apply(lambda x: winsorize(x, limits=[0.01, 0.01]))  # Caps top & bottom 1%
Keep Outliers If:
Remove Outliers If:
If we decide to deal with outliers, here are the best approaches for our dataset:
1. Winsorization (Capping Outliers)
2. Transforming Data (Log or Square Root Transform)
3. Removing Outliers Based on Z-Score or IQR
Since our dataset includes real-world COVID-19 data, the best approach is to use winsorization or log transformation rather than direct removal because:
1. COVID-19 cases and deaths have natural extreme spikes.
2. Removing outliers may hide crucial trends (waves, lockdown effects).
3. Smoothing extreme values is better than removing them entirely.
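The three approaches named above can be compared on a tiny example. This is only a sketch: the `values` array is hypothetical, and the 10% limits are chosen so that capping is visible on ten points (the notebook itself uses 1%).

```python
import numpy as np
from scipy.stats.mstats import winsorize

# Hypothetical daily counts with one extreme spike.
values = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 1000], dtype=float)

# 1. Winsorization: replace the lowest and highest 10% of points
#    with the nearest remaining value.
capped = np.asarray(winsorize(values, limits=[0.1, 0.1]))

# 2. Log transform: log1p handles zeros and compresses large values.
logged = np.log1p(values)

# 3. IQR rule: flag points more than 1.5 * IQR outside the quartiles.
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
is_outlier = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)

print(capped)
print(is_outlier)
```

Winsorization and the log transform keep every row, while the IQR mask would be used to drop or inspect the flagged rows.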
# Calculate Z-scores for numerical columns
z_scores = df[numerical_cols].apply(zscore)
# Define a threshold (e.g., 3 or -3)
threshold = 3
outliers_z = (z_scores > threshold) | (z_scores < -threshold)
# Count outliers in each column
print("Outliers detected using Z-Score:")
print(outliers_z.sum())
Outliers detected using Z-Score:
total_confirmed_cases 5337
new_confirmed_cases 6543
new_cases_smoothed 6702
total_deaths_reported 7215
new_deaths_reported 6627
new_deaths_smoothed 6567
total_cases_per_million 6869
new_cases_per_million 7602
new_cases_smoothed_per_million 7674
total_deaths_per_million 6592
new_deaths_per_million 8613
new_deaths_smoothed_per_million 9056
reproduction_rate 0
icu_patients 12520
icu_patients_per_million 7859
hosp_patients 0
hosp_patients_per_million 5069
weekly_icu_admissions 5435
weekly_icu_admissions_per_million 4564
weekly_hosp_admissions 10595
weekly_hosp_admissions_per_million 4741
total_tests 7934
new_tests 8230
total_tests_per_thousand 6546
new_tests_per_thousand 6563
new_tests_smoothed 6676
new_tests_smoothed_per_thousand 7098
positive_rate 7901
tests_per_case 4863
total_vaccinations 5208
people_vaccinated 4846
people_fully_vaccinated 4798
total_boosters 5211
new_vaccinations 6333
new_vaccinations_smoothed 6228
total_vaccinations_per_hundred 0
people_vaccinated_per_hundred 0
people_fully_vaccinated_per_hundred 0
total_boosters_per_hundred 3415
new_vaccinations_smoothed_per_million 7593
new_people_vaccinated_smoothed 6022
new_people_vaccinated_smoothed_per_hundred 9025
stringency_index 0
population_density 5540
median_age 0
aged_65_older 0
aged_70_older 0
gdp_per_capita 5571
extreme_poverty 0
cardiovasc_death_rate 0
diabetes_prevalence 4784
female_smokers 0
male_smokers 0
handwashing_facilities 0
hospital_beds_per_thousand 8372
life_expectancy 0
human_development_index 0
population 5980
excess_mortality_cumulative_absolute 8036
excess_mortality_cumulative 3678
excess_mortality 7339
excess_mortality_cumulative_per_million 5224
dtype: int64
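With the extreme values capped by winsorization, we are good to go. Note that the Z-score check above still reports non-zero counts for many columns: capping also changes each column's mean and standard deviation, so values at the caps can still lie more than 3 standard deviations from the mean.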
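A Z-score check can keep flagging values after capping, because winsorization pulls the extremes inward and shrinks the standard deviation, so the 3-sigma threshold moves inward as well. A small sketch on synthetic data shows the effect; the seed and the lognormal sample are assumptions for illustration, not the COVID-19 data.

```python
import numpy as np
from scipy.stats import zscore
from scipy.stats.mstats import winsorize

# Synthetic heavy-tailed sample (assumption: stands in for a skewed column).
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=2.0, size=10_000)

# Cap the bottom and top 1%, as in the notebook.
capped = np.asarray(winsorize(x, limits=[0.01, 0.01]))

# Count values beyond 3 standard deviations before and after capping.
flagged_before = int((np.abs(zscore(x)) > 3).sum())
flagged_after = int((np.abs(zscore(capped)) > 3).sum())

print("std before:", x.std(), "std after:", capped.std())
print("flagged before:", flagged_before, "flagged after:", flagged_after)
```

For this reason the Z-score count after winsorization is not a reliable measure of whether capping worked; comparing the column maxima and minima before and after is a more direct check.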
df.describe(include='all')
| | country_code | continent | country | date | total_confirmed_cases | new_confirmed_cases | new_cases_smoothed | total_deaths_reported | new_deaths_reported | new_deaths_smoothed | total_cases_per_million | new_cases_per_million | new_cases_smoothed_per_million | total_deaths_per_million | new_deaths_per_million | new_deaths_smoothed_per_million | reproduction_rate | icu_patients | icu_patients_per_million | hosp_patients | hosp_patients_per_million | weekly_icu_admissions | weekly_icu_admissions_per_million | weekly_hosp_admissions | weekly_hosp_admissions_per_million | total_tests | new_tests | total_tests_per_thousand | new_tests_per_thousand | new_tests_smoothed | new_tests_smoothed_per_thousand | positive_rate | tests_per_case | tests_units | total_vaccinations | people_vaccinated | people_fully_vaccinated | total_boosters | new_vaccinations | new_vaccinations_smoothed | total_vaccinations_per_hundred | people_vaccinated_per_hundred | people_fully_vaccinated_per_hundred | total_boosters_per_hundred | new_vaccinations_smoothed_per_million | new_people_vaccinated_smoothed | new_people_vaccinated_smoothed_per_hundred | stringency_index | population_density | median_age | aged_65_older | aged_70_older | gdp_per_capita | extreme_poverty | cardiovasc_death_rate | diabetes_prevalence | female_smokers | male_smokers | handwashing_facilities | hospital_beds_per_thousand | life_expectancy | human_development_index | population | excess_mortality_cumulative_absolute | excess_mortality_cumulative | excess_mortality | excess_mortality_cumulative_per_million |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 295840 | 295840 | 295840 | 295840 | 2.958400e+05 | 295840.000000 | 295840.000000 | 2.958400e+05 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 2.958400e+05 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840 | 2.958400e+05 | 2.958400e+05 | 2.958400e+05 | 2.958400e+05 | 2.958400e+05 | 2.958400e+05 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.00000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 295840.000000 | 2.958400e+05 | 2.958400e+05 | 295840.000000 | 295840.000000 | 295840.000000 |
| unique | 248 | 6 | 248 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| top | ARG | Africa | Argentina | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | tests performed | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| freq | 1198 | 71760 | 1198 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 216495 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| mean | NaN | NaN | NaN | 2021-08-23 13:55:08.729042944 | 3.294126e+06 | 3635.738247 | 3704.414733 | 4.774526e+04 | 37.930527 | 38.733286 | 92596.267365 | 128.229897 | 140.757851 | 848.157616 | 0.884128 | 0.945825 | 0.752354 | 219.250882 | 5.212657 | 2984.668493 | 80.188060 | 72.420673 | 3.708227 | 1522.244355 | 53.871693 | 2.249765e+07 | 28121.147739 | 1249.748769 | 2.355675 | 33146.061060 | 1.784718 | 0.142600 | 132.981153 | NaN | 9.991498e+07 | 4.187658e+07 | 3.890271e+07 | 2.646347e+07 | 7.137484e+04 | 5.885654e+04 | 123.097460 | 53.530107 | 49.582787 | 33.543533 | 1525.56921 | 20538.642090 | 0.054311 | 34.424177 | 308.207316 | 30.034724 | 8.514619 | 5.268092 | 19076.362702 | 14.234671 | 263.110822 | 8.600771 | 11.395997 | 31.541347 | 49.450266 | 3.068764 | 73.646679 | 0.719877 | 5.323516e+07 | 6.957121e+04 | 13.511943 | 10.712484 | 2233.221401 |
| min | NaN | NaN | NaN | 2020-01-01 00:00:00 | 4.000000e+00 | 0.000000 | 0.000000 | 1.000000e+00 | 0.000000 | 0.000000 | 1.017000 | 0.000000 | 0.000000 | 0.321000 | 0.000000 | 0.000000 | -0.010000 | 0.000000 | 0.000000 | 22.000000 | 6.562000 | 1.000000 | 0.552000 | 5.000000 | 1.893000 | 7.930000e+02 | 8.000000 | 0.497000 | 0.003000 | 14.000000 | 0.006000 | 0.000000 | 1.100000 | NaN | 1.970000e+02 | 5.490000e+02 | 1.150000e+03 | 4.000000e+01 | 0.000000e+00 | 0.000000e+00 | 0.040000 | 0.040000 | 0.070000 | 0.000000 | 0.00000 | 0.000000 | 0.000000 | 0.000000 | 3.078000 | 16.400000 | 1.307000 | 1.114000 | 752.788000 | 0.100000 | 85.755000 | 1.910000 | 0.200000 | 8.500000 | 1.188000 | 0.300000 | 54.330000 | 0.398000 | 1.893000e+03 | -4.658800e+03 | -10.340000 | -36.510000 | -1034.877900 |
| 25% | NaN | NaN | NaN | 2020-10-29 00:00:00 | 7.342750e+03 | 0.000000 | 0.714000 | 1.230000e+02 | 0.000000 | 0.000000 | 2492.635000 | 0.000000 | 0.177000 | 57.555000 | 0.000000 | 0.000000 | 0.380000 | 8.000000 | 1.423000 | 385.000000 | 26.645000 | 7.000000 | 1.434000 | 162.000000 | 23.683000 | 3.950400e+05 | 1278.000000 | 68.836750 | 0.222000 | 598.750000 | 0.124000 | 0.016100 | 4.800000 | NaN | 3.800000e+05 | 2.052590e+05 | 1.848010e+05 | 6.133800e+04 | 5.220000e+02 | 9.100000e+01 | 48.210000 | 31.070000 | 27.120000 | 6.330000 | 42.00000 | 10.000000 | 0.000000 | 11.110000 | 39.497000 | 21.700000 | 3.526000 | 2.063000 | 4227.630000 | 0.600000 | 176.957000 | 5.460000 | 1.700000 | 21.000000 | 19.351000 | 1.300000 | 69.020000 | 0.594000 | 4.099890e+05 | 2.317750e+02 | 6.110000 | -5.950000 | 651.846070 |
| 50% | NaN | NaN | NaN | 2021-08-24 00:00:00 | 6.659900e+04 | 14.000000 | 33.286000 | 1.325000e+03 | 0.000000 | 0.143000 | 24861.928000 | 2.166000 | 9.319000 | 389.266000 | 0.000000 | 0.024000 | 0.850000 | 29.000000 | 2.844000 | 550.000000 | 55.594000 | 30.000000 | 1.619000 | 515.000000 | 41.806000 | 2.870685e+06 | 4118.000000 | 411.155000 | 0.613000 | 3338.000000 | 0.545000 | 0.080300 | 12.100000 | NaN | 3.739158e+06 | 2.239434e+06 | 1.993200e+06 | 7.798590e+05 | 5.221000e+03 | 1.313000e+03 | 120.180000 | 59.710000 | 54.120000 | 28.660000 | 295.00000 | 231.000000 | 0.005000 | 27.500000 | 99.110000 | 29.100000 | 6.211000 | 3.519000 | 12895.635000 | 2.200000 | 243.811000 | 7.200000 | 6.300000 | 30.200000 | 44.600000 | 2.400000 | 75.000000 | 0.738000 | 5.637022e+06 | 5.963201e+03 | 13.190000 | 2.890000 | 1753.307300 |
| 75% | NaN | NaN | NaN | 2022-06-18 00:00:00 | 6.305112e+05 | 439.000000 | 527.857000 | 9.854000e+03 | 5.000000 | 5.429000 | 118749.429000 | 68.085500 | 102.821750 | 1309.040000 | 0.389000 | 0.723000 | 1.050000 | 165.000000 | 8.174000 | 3350.000000 | 113.283750 | 77.000000 | 3.797000 | 1291.000000 | 67.691000 | 1.234731e+07 | 19244.000000 | 1433.869000 | 1.983000 | 15032.750000 | 1.729000 | 0.196200 | 50.000000 | NaN | 2.276167e+07 | 1.067369e+07 | 9.792266e+06 | 6.936989e+06 | 3.515300e+04 | 1.670425e+04 | 193.410000 | 77.600000 | 71.980000 | 54.670000 | 1653.00000 | 4137.000000 | 0.045000 | 50.970000 | 237.012000 | 38.700000 | 13.260000 | 8.160000 | 26808.164000 | 23.500000 | 336.717000 | 10.790000 | 20.100000 | 41.100000 | 82.502000 | 4.000000 | 79.190000 | 0.828000 | 2.620798e+07 | 4.980240e+04 | 20.400000 | 15.210000 | 3305.707800 |
| max | NaN | NaN | NaN | 2023-04-12 00:00:00 | 1.165863e+08 | 125143.000000 | 122485.857000 | 1.398618e+06 | 1287.000000 | 1294.143000 | 599142.341000 | 2602.864000 | 2416.903000 | 4617.392000 | 15.549000 | 12.994000 | 1.770000 | 2089.000000 | 55.417000 | 14975.000000 | 409.450000 | 464.000000 | 20.936000 | 14366.000000 | 221.355000 | 3.926741e+08 | 409271.000000 | 14707.401000 | 39.495000 | 699733.000000 | 22.221000 | 0.952300 | 4566.400000 | NaN | 3.415934e+09 | 1.302773e+09 | 1.272830e+09 | 8.283965e+08 | 1.732373e+06 | 1.700593e+06 | 298.090000 | 105.820000 | 105.820000 | 128.220000 | 15745.00000 | 709346.000000 | 0.694000 | 90.740000 | 7915.731000 | 47.900000 | 23.021000 | 16.240000 | 104861.851000 | 71.700000 | 597.029000 | 27.250000 | 43.000000 | 65.800000 | 100.000000 | 13.050000 | 84.860000 | 0.955000 | 1.425887e+09 | 1.282260e+06 | 51.740000 | 166.230000 | 10066.715000 |
| std | NaN | NaN | NaN | NaN | 1.433953e+07 | 15945.612323 | 15797.481707 | 1.929127e+05 | 164.719089 | 165.323935 | 140934.885507 | 372.025550 | 356.389151 | 1058.946871 | 2.427710 | 2.188158 | 0.445604 | 407.447529 | 8.071772 | 4510.269729 | 74.258545 | 90.221415 | 3.894764 | 2997.028427 | 40.987913 | 5.856635e+07 | 65431.507862 | 2320.403624 | 5.735614 | 92938.518533 | 3.362456 | 0.184563 | 571.393907 | NaN | 4.142194e+08 | 1.651486e+08 | 1.583957e+08 | 1.073620e+08 | 2.258135e+05 | 2.209618e+05 | 83.971049 | 29.004464 | 28.092447 | 29.342011 | 2766.29427 | 87392.760171 | 0.119391 | 23.759028 | 959.414433 | 9.029077 | 6.005725 | 3.999579 | 20207.929852 | 20.104251 | 121.463806 | 4.963716 | 11.459313 | 13.467883 | 32.232231 | 2.541954 | 7.297570 | 0.147177 | 1.939979e+08 | 1.981024e+05 | 11.517135 | 30.179084 | 2226.839542 |
fig = px.sunburst(df,
                  path=["continent", "country"],
                  values="total_confirmed_cases",
                  color="total_deaths_reported",
                  title="COVID-19 Spread Across Continents & Countries")
fig.show()
# Select a subset of columns for the Parallel Coordinates Chart
df_parallel = df[['country', 'total_confirmed_cases', 'total_deaths_reported', 'total_vaccinations', 'population']]
# Create Parallel Coordinates Chart
fig_parallel = px.parallel_coordinates(df_parallel, color='total_confirmed_cases',
title='Parallel Coordinates: COVID-19 Metrics by Country',
labels={'total_confirmed_cases': 'Total Cases', 'total_deaths_reported': 'Total Deaths', 'total_vaccinations': 'Total Vaccinations', 'population': 'Population'},
template='plotly_dark')
# Show the plot
fig_parallel.show()
# Create 3D Scatter Plot
fig_3d = px.scatter_3d(df, x='total_confirmed_cases', y='total_deaths_reported', z='total_vaccinations',
color='continent', title='3D Scatter Plot: Cases, Deaths, and Vaccinations',
labels={'total_confirmed_cases': 'Total Cases', 'total_deaths_reported': 'Total Deaths', 'total_vaccinations': 'Total Vaccinations'},
template='plotly_dark')
# Show the plot
fig_3d.show()
# Create Animated Bubble Chart
fig_bubble = px.scatter(df, x='total_confirmed_cases', y='total_deaths_reported', size='population',
color='continent', animation_frame=df['date'].dt.strftime('%Y-%m-%d'),
title='Animated Bubble Chart: Cases and Deaths Over Time',
labels={'total_confirmed_cases': 'Total Cases', 'total_deaths_reported': 'Total Deaths', 'population': 'Population'},
template='plotly_dark')
# Adjust animation speed
fig_bubble.layout.updatemenus[0].buttons[0].args[1]['frame']['duration'] = 100
fig_bubble.layout.updatemenus[0].buttons[0].args[1]['transition']['duration'] = 50
# Show the plot
fig_bubble.show()
fig = px.scatter(df,
                 x="total_confirmed_cases",
                 y="total_deaths_reported",
                 size="population",
                 color="continent",
                 hover_name="country",
                 title="⚡ Cases vs. Deaths vs. Population Bubble Chart")
fig.show()
fig = px.choropleth(df,
                    locations="country",
                    locationmode="country names",
                    color="new_tests_per_thousand",
                    hover_name="country",
                    animation_frame=df['date'].astype(str),
                    title="🧪 COVID-19 Testing Rate Per Thousand",
                    color_continuous_scale="Purples")
fig.show()
fig = px.treemap(df,
                 path=["continent", "country"],
                 values="total_tests",
                 color="positive_rate",
                 title="🧪 COVID-19 Testing & Positivity Rate")
fig.show()
# Set seaborn style
sns.set(style="whitegrid")
### Daily & Cumulative Confirmed Cases Over Time ###
plt.figure(figsize=(12, 6))
sns.lineplot(data=df, x='date', y='total_confirmed_cases', label='Total Confirmed Cases', color='blue')
sns.lineplot(data=df, x='date', y='new_confirmed_cases', label='New Cases', color='red')
plt.xlabel("Date")
plt.ylabel("Cases")
plt.title("COVID-19 Confirmed Cases Over Time")
plt.legend()
plt.xticks(rotation=45)
plt.show()
### Top 10 Countries with Highest Cases ###
top_countries = df.groupby('country')['total_confirmed_cases'].max().sort_values(ascending=False).head(10)
plt.figure(figsize=(10, 5))
sns.barplot(x=top_countries.values, y=top_countries.index, palette="Reds_r")
plt.xlabel("Total Confirmed Cases")
plt.ylabel("Country")
plt.title("Top 10 Countries with Highest COVID-19 Cases")
plt.show()
### Case Fatality Rate (CFR%) Trend ###
df["CFR"] = (df["total_deaths_reported"] / df["total_confirmed_cases"]) * 100
plt.figure(figsize=(12, 6))
sns.lineplot(data=df, x='date', y='CFR', color='purple')
plt.xlabel("Date")
plt.ylabel("Case Fatality Rate (%)")
plt.title("COVID-19 Case Fatality Rate Over Time")
plt.show()
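The row-level ratio above becomes `inf` or `NaN` wherever `total_confirmed_cases` is zero, and `sns.lineplot` then averages it across countries for each date. One possible alternative, sketched here on a made-up toy frame that reuses this notebook's column names, is to pool deaths and cases per date first and mask zero denominators before dividing. The helper name `global_cfr` is hypothetical.

```python
import numpy as np
import pandas as pd

# Toy data (illustrative values only), using this notebook's column names.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-01", "2020-03-01", "2020-03-02", "2020-03-02"]),
    "country": ["A", "B", "A", "B"],
    "total_confirmed_cases": [0, 0, 100, 300],
    "total_deaths_reported": [0, 0, 2, 6],
})

def global_cfr(frame: pd.DataFrame) -> pd.DataFrame:
    """Pooled CFR (%) per date: summed deaths / summed cases, NaN where cases are zero."""
    daily = frame.groupby("date", as_index=False)[
        ["total_confirmed_cases", "total_deaths_reported"]
    ].sum()
    # Replace zero denominators with NaN so the division never yields inf.
    cases = daily["total_confirmed_cases"].replace(0, np.nan)
    daily["CFR"] = daily["total_deaths_reported"] / cases * 100
    return daily

cfr = global_cfr(toy)
print(cfr)
```

The resulting `cfr` frame has one row per date, so it can be passed to `sns.lineplot(data=cfr, x='date', y='CFR')` without any implicit averaging.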
### Vaccination Progress (Top 10 Countries) ###
top_vaccine_countries = df.groupby('country')['total_vaccinations'].max().sort_values(ascending=False).head(10)
plt.figure(figsize=(10, 5))
sns.barplot(x=top_vaccine_countries.values, y=top_vaccine_countries.index, palette="Blues_r")
plt.xlabel("Total Vaccinations")
plt.ylabel("Country")
plt.title("Top 10 Countries by Total Vaccinations")
plt.show()
### Top 10 Countries with Highest Cases - Interactive Bar Chart ###
top_countries = df.groupby('country')['total_confirmed_cases'].max().sort_values(ascending=False).head(10)
fig = px.bar(x=top_countries.values, y=top_countries.index,
orientation='h',
title="Top 10 Countries with Highest COVID-19 Cases",
labels={'x': 'Total Confirmed Cases', 'y': 'Country'},
color=top_countries.values, color_continuous_scale='Reds')
fig.update_layout(yaxis={'categoryorder': 'total ascending'})
fig.show()
### Vaccination Progress - Top 10 Countries ###
top_vaccine_countries = df.groupby('country')['total_vaccinations'].max().sort_values(ascending=False).head(10)
fig = px.bar(x=top_vaccine_countries.values, y=top_vaccine_countries.index,
orientation='h',
title="Top 10 Countries by Total Vaccinations",
labels={'x': 'Total Vaccinations', 'y': 'Country'},
color=top_vaccine_countries.values, color_continuous_scale='Blues')
fig.update_layout(yaxis={'categoryorder': 'total ascending'})
fig.show()
# Create a choropleth map for total confirmed cases
fig = px.choropleth(df,
locations="country",
locationmode="country names",
color="total_confirmed_cases",
hover_name="country",
animation_frame=df['date'].astype(str),
title="Global COVID-19 Cases Over Time",
color_continuous_scale="Reds")
fig.show()
# Get top 10 affected countries
top_countries = df.groupby("country")["total_confirmed_cases"].max().nlargest(10).index
df_top = df[df["country"].isin(top_countries)]
# Create animated bar chart
fig = px.bar(df_top,
             x="total_confirmed_cases",
             y="country",
             color="country",
             animation_frame=df_top['date'].astype(str),
             title="COVID-19 Cases Growth Over Time - Top 10 Countries",
             labels={'total_confirmed_cases': 'Total Cases', 'country': 'Country'},
             orientation='h')
fig.update_layout(yaxis={'categoryorder': 'total ascending'})
fig.show()
fig = px.choropleth(df,
                    locations="country",
                    locationmode="country names",
                    color="people_fully_vaccinated_per_hundred",
                    hover_name="country",
                    animation_frame=df['date'].astype(str),
                    title="Global COVID-19 Vaccination Progress Over Time",
                    color_continuous_scale="Blues")
fig.show()
fig = px.choropleth(df,
                    locations="country",
                    locationmode="country names",
                    color="total_deaths_per_million",
                    hover_name="country",
                    animation_frame=df['date'].astype(str),
                    title="⚰️ COVID-19 Mortality Rate Per Million",
                    color_continuous_scale="OrRd")
fig.show()
# Group by date and calculate daily new cases and deaths
df_daily = df.groupby('date').agg({
'new_confirmed_cases': 'sum',
'new_deaths_reported': 'sum'
}).reset_index()
# Plot daily new cases and deaths
plt.figure(figsize=(14, 6))
plt.plot(df_daily['date'], df_daily['new_confirmed_cases'], label='Daily New Cases')
plt.plot(df_daily['date'], df_daily['new_deaths_reported'], label='Daily New Deaths')
plt.title('Daily New COVID-19 Cases and Deaths')
plt.xlabel('Date')
plt.ylabel('Count')
plt.legend()
plt.grid()
plt.show()
# Group by date and calculate total vaccinations
df_vaccination = df.groupby('date').agg({
'total_vaccinations': 'sum',
'people_vaccinated': 'sum',
'people_fully_vaccinated': 'sum'
}).reset_index()
# Plot vaccination progress
plt.figure(figsize=(14, 6))
plt.plot(df_vaccination['date'], df_vaccination['total_vaccinations'], label='Total Vaccinations')
plt.plot(df_vaccination['date'], df_vaccination['people_vaccinated'], label='People Vaccinated')
plt.plot(df_vaccination['date'], df_vaccination['people_fully_vaccinated'], label='People Fully Vaccinated')
plt.title('Global COVID-19 Vaccination Progress')
plt.xlabel('Date')
plt.ylabel('Count')
plt.legend()
plt.grid()
plt.show()
# Group by date and calculate total cases and deaths
df_grouped = df.groupby('date').agg({
'total_confirmed_cases': 'sum',
'total_deaths_reported': 'sum'
}).reset_index()
# Plot total cases and deaths over time
plt.figure(figsize=(14, 6))
plt.plot(df_grouped['date'], df_grouped['total_confirmed_cases'], label='Total Confirmed Cases')
plt.plot(df_grouped['date'], df_grouped['total_deaths_reported'], label='Total Deaths Reported')
plt.title('Global COVID-19 Cases and Deaths Over Time')
plt.xlabel('Date')
plt.ylabel('Count')
plt.legend()
plt.grid()
plt.show()
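# Group by continent and calculate total cases and deaths.
# The totals are cumulative, so take each country's maximum first;
# summing every daily row would count the same cases many times over.
df_continent = (
    df.groupby(['continent', 'country'])[['total_confirmed_cases', 'total_deaths_reported']]
      .max()
      .groupby(level='continent')
      .sum()
      .reset_index()
)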
# Plot total cases by continent
plt.figure(figsize=(14, 6))
sns.barplot(x='continent', y='total_confirmed_cases', data=df_continent, palette='coolwarm')
plt.title('Total COVID-19 Cases by Continent')
plt.xlabel('Continent')
plt.ylabel('Total Confirmed Cases')
plt.show()
# Plot total deaths by continent
plt.figure(figsize=(14, 6))
sns.barplot(x='continent', y='total_deaths_reported', data=df_continent, palette='coolwarm')
plt.title('Total COVID-19 Deaths by Continent')
plt.xlabel('Continent')
plt.ylabel('Total Deaths Reported')
plt.show()

df.to_csv('/content/drive/My Drive/Data Sets/cleaned_covid_data.csv', index=False)
After preprocessing, we saved the cleaned DataFrame as a new CSV.
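A CSV file stores dates as plain text, so the `date` column needs to be parsed again when the cleaned file is reloaded. The round trip below is a small sketch on made-up data; the temporary path is a stand-in for the Drive path used above.

```python
import os
import tempfile

import pandas as pd

# Toy frame (illustrative values only) with a datetime column.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02"]),
    "country": ["A", "A"],
    "total_confirmed_cases": [10, 15],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cleaned_covid_data.csv")  # hypothetical temp location
    toy.to_csv(path, index=False)  # index=False keeps the row index out of the file
    # Ask pandas to parse the date column instead of leaving it as text.
    reloaded = pd.read_csv(path, parse_dates=["date"])

print(reloaded.dtypes)
```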
Total Cases & Deaths:
Vaccination Impact:
Testing & Positive Rate:
ICU & Hospitalization Trends:
Moving Averages (Week, Month, Quarter, Year):
Seasonality & Waves:
Top 5 Most Affected Countries:
Regional Differences in Mortality & Recovery:
Lockdown & Policy Impact:
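The moving-average views listed above can be approximated with a rolling mean computed separately for each country. The sketch below uses made-up data and assumes the `new_confirmed_cases` column from this notebook; the output column name `cases_ma7` is hypothetical.

```python
import pandas as pd

# Toy data (illustrative values only): ten days for two countries.
dates = pd.date_range("2021-01-01", periods=10, freq="D")
toy = pd.DataFrame({
    "date": list(dates) * 2,
    "country": ["A"] * 10 + ["B"] * 10,
    "new_confirmed_cases": list(range(10)) + [5] * 10,
})

# Rolling windows depend on row order, so sort within each country first.
toy = toy.sort_values(["country", "date"])

# 7-day rolling mean per country; min_periods=1 avoids NaN in the first rows.
toy["cases_ma7"] = (
    toy.groupby("country")["new_confirmed_cases"]
       .transform(lambda s: s.rolling(window=7, min_periods=1).mean())
)

print(toy.tail())
```

Changing `window` to 30, 90, or 365 gives rough monthly, quarterly, and yearly averages under the same assumptions.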
✅ Cases vs. Testing: More testing leads to higher
case detection, reducing underreporting risks.
✅ Vaccination vs. Death Rate: Higher vaccine coverage
significantly reduces fatalities.
✅ ICU Admissions vs. Healthcare Capacity: Countries
with fewer ICU beds faced greater strain during peak surges.
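Relationships like those above can be checked with a correlation matrix over one row per country. The sketch below runs on made-up numbers, so the printed values say nothing about the real dataset. The column names are assumed to match the per-capita columns in the source data, and a correlation on its own does not establish cause.

```python
import pandas as pd

# Toy country-level snapshot (illustrative values only).
toy = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "total_tests_per_thousand": [50, 120, 200, 310, 400],
    "total_cases_per_million": [1000, 2100, 3900, 6100, 8200],
    "people_fully_vaccinated_per_hundred": [10, 25, 40, 60, 80],
    "total_deaths_per_million": [900, 700, 520, 300, 150],
})

# Pairwise Pearson correlation between the numeric columns.
corr = toy.drop(columns="country").corr(method="pearson")

tests_vs_cases = corr.loc["total_tests_per_thousand", "total_cases_per_million"]
vax_vs_deaths = corr.loc["people_fully_vaccinated_per_hundred", "total_deaths_per_million"]
print(corr.round(2))
```

On the real data, the snapshot would first need to be reduced to one row per country (for example, the latest available date) before calling `corr`.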
🔹 Early Testing & Containment: Rapid testing
can prevent unchecked outbreaks.
🔹 Localized Lockdowns: Stringent policies in high-risk
areas can reduce case surges.
🔹 Booster Campaigns: Rolling out booster shots in
high-risk regions can curb case spikes.
🔹 Global Equity in Vaccines: Some countries lag behind
in vaccine availability, requiring support.
🔹 ICU & Hospital Capacity Planning: Investing
in healthcare infrastructure can mitigate future crises.
🔹 Medical Supply Chain Optimization: Ensuring
availability of PPE, ventilators, and essential drugs is crucial.
🔹 Masking & Social Distancing Campaigns: Public
adherence improves when policies are clearly communicated.
🔹 Misinformation Control: Governments and health
agencies must combat false narratives around COVID-19.
🔹 Real-Time Monitoring Dashboards: Governments and
organizations should use dynamic dashboards to track case trends,
hospitalizations, and vaccinations.
🔹 AI & Predictive Analysis: Leveraging machine
learning can help predict future outbreaks based on existing
patterns.
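As a starting point for the predictive analysis mentioned above, a deliberately simple baseline is a log-linear least-squares fit, which assumes roughly exponential growth over a short window. This is not a machine-learning model and is not part of the notebook's analysis. The data is made up and the helper name `forecast` is hypothetical.

```python
import numpy as np

# Made-up cumulative counts growing 10% per day (illustrative only).
days = np.arange(14)
cases = 100 * 1.10 ** days

# Fit log(cases) = intercept + slope * day by ordinary least squares.
slope, intercept = np.polyfit(days, np.log(cases), deg=1)

def forecast(day: int) -> float:
    """Extrapolate the fitted exponential trend to a given day index."""
    return float(np.exp(intercept + slope * day))

# Doubling time implied by the fitted growth rate, in days.
doubling_time = np.log(2) / slope

print(f"daily growth factor: {np.exp(slope):.3f}")
print(f"doubling time (days): {doubling_time:.1f}")
print(f"forecast for day 14: {forecast(14):.0f}")
```

Real case curves bend as interventions and immunity change, so a fit like this is at most a reference line against which more capable models can be compared.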
This COVID-19 Analysis Report provides a comprehensive understanding of the pandemic's impact, highlighting key trends, challenges, and actionable insights. With interactive dashboards, governments, healthcare professionals, and policymakers can make data-driven decisions to better manage future outbreaks.